| BML-S |

Last entries

    20170724.1750
  1. Hive server Beeline not working - [s]

    org.apache.hive.service.cli.HiveSQLException: Error while processing statement: hive configuration hive.mapred.supports.subdirectories does not exists.

    If this started happening after upgrading from a version older than 2.4.2 to 2.5.x, it is likely caused by HIVE-11582, which removed the subdirectories parameter from the default hive.security.authorization.sqlstd.confwhitelist. To resolve it, set:

    hive.security.authorization.sqlstd.confwhitelist.append=hive\.mapred\.supports\.subdirectories
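    After appending the parameter to the whitelist (e.g. in hiveserver2-site via Ambari) and restarting HiveServer2, the setting should be accepted again. A minimal check from Beeline (a sketch; the JDBC URL and user are placeholders):

      beeline -u 'jdbc:hive2://hiveserver2.example.com:10000/default' -n hive \
        -e "set hive.mapred.supports.subdirectories=true;"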

  2. 20170721.0842
  3. spark|zeppelin|livy > How to get Spark Interpreter working with custom python configuration - [s]

    Ambari => Spark => Custom livy-conf
    livy.spark.master=yarn-cluster
    # didn't work

    Ambari => Zeppelin Notebook => Advanced zeppelin-env
    # Pyspark (supported with Spark 1.2.1 and above)
    # To configure pyspark, you need to set spark distribution's path to 'spark.home' property in Interpreter setting screen in Zeppelin GUI
    # path to the python command. must be the same path on the driver(Zeppelin) and all workers.
    # export PYSPARK_PYTHON
    export PYSPARK_DRIVER_PYTHON="/opt/Python-2.7.6/python"
    export PYSPARK_PYTHON="/opt/Python-2.7.6/python"
    # didn't work

    livy.spark.executorEnv.PYSPARK_PYTHON=/opt/Python-2.7.6/python
    livy.spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/Python-2.7.6/python
    # didn't work

    %livy.pyspark
    import os,sys
    os.environ["PYSPARK_PYTHON"]="/opt/Python-2.7.6/python"
    os.environ["PYSPARK_DRIVER_PYTHON"]="/opt/Python-2.7.6/python"
    print(sys.version)
    # Seems to work? At least yarn app log sets env variable "export PYSPARK_PYTHON="/opt/Python-2.7.6/python""

    %livy.pyspark
    #dir()
    conf.set('spark.yarn.appMasterEnv.PYSPARK_PYTHON', '/opt/Python-2.7.6/python')
    conf.set('spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON', '/opt/Python-2.7.6/python')

    Also, changing spark-env template works (of course)

    Ref:
    https://issues.apache.org/jira/browse/ZEPPELIN-2195
    LIVY-159  SPARK-13081  SPARK-16110
    http://spark.apache.org/docs/latest/configuration.html#available-properties

    Note: When running Spark on YARN in cluster mode, environment variables need to be set using the spark.yarn.appMasterEnv.[EnvironmentVariableName] property in your conf/spark-defaults.conf file. Environment variables that are set in spark-env.sh will not be reflected in the YARN Application Master process in cluster mode. See the YARN-related Spark Properties for more information.
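    To confirm the appMasterEnv approach independently of Zeppelin/Livy, the same properties can be passed to a plain spark-submit in yarn-cluster mode (a sketch; the script path is a placeholder for a script that just prints sys.version):

      spark-submit --master yarn --deploy-mode cluster \
        --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/Python-2.7.6/python \
        --conf spark.executorEnv.PYSPARK_PYTHON=/opt/Python-2.7.6/python \
        /tmp/print_python_version.py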

  4. 20170720.0712
  5. How to enable user impersonation for JDBC interpreter in Zeppelin with Kerberos disabled - Hortonworks - [s]

  6. 20170719.1819
  7. hadoop|core-site.xml| Mapping Kerberos Principals to Short Names with auth_to_local - [s]

    The DEFAULT setting strips the @REALM portion from the Kerberos principal, where REALM is the Kerberos realm defined by the default_realm setting in the NameNode krb5.conf file.
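    Custom rules can be prepended to DEFAULT in hadoop.security.auth_to_local (core-site.xml). A sketch using an EXAMPLE.COM realm, plus the stock way to test how a principal maps:

      <property>
        <name>hadoop.security.auth_to_local</name>
        <value>
          RULE:[2:$1@$0](rm@EXAMPLE.COM)s/.*/yarn/
          RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//
          DEFAULT
        </value>
      </property>

      # test a mapping against the current client config
      hadoop org.apache.hadoop.security.HadoopKerberosName rm/node1.localdomain@EXAMPLE.COM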

  8. 20170717.0757
  9. [HDFS-8708] DFSClient should ignore dfs.client.retry.policy.enabled for HA proxies - ASF JIRA - [s]

    dfs.client.retry.policy.enabled=false for HA
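    Expressed as a client-side hdfs-site.xml override (sketch):

      <property>
        <name>dfs.client.retry.policy.enabled</name>
        <value>false</value>
      </property>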

  10. 20170712.0908
  11. ambari > how to Add a host|node and deploy a component using API - [s]

  12. 20170710.1454
  13. Configuring SolrCloud - Hortonworks Data Platform - [s]

    yum install lucidworks-hdpsearch
    # or let setup script to install by setting:
    #   SOLR_INSTALL=true
    #   SOLR_DOWNLOAD_URL=http://archive.apache.org/dist/lucene/solr/5.5.4/solr-5.5.4.tgz
    ---------------------------------------------------
    [root@sandbox ranger_audits]# su - solr
    [solr@sandbox ~]$ bash -x /opt/solr/ranger_audit_server/scripts/add_ranger_audits_conf_to_zk.sh
    + JAVA_HOME=/usr/lib/jvm/java
    + SOLR_USER=solr
    + SOLR_ZK=sandbox.hortonworks.com:2181/ranger_audits
    + SOLR_INSTALL_DIR=/opt/solr
    + SOLR_RANGER_HOME=/opt/solr/ranger_audit_server
    ++ whoami
    + '[' solr '!=' solr ']'
    + '[' sandbox.hortonworks.com:2181/ranger_audits = '' ']'
    + '[' /opt/solr = '' ']'
    + '[' /opt/solr/ranger_audit_server = '' ']'
    + SOLR_RANGER_CONFIG_NAME=ranger_audits
    + SOLR_RANGER_CONFIG_LOCAL_PATH=/opt/solr/ranger_audit_server/conf
    + ZK_CLI=/opt/solr/server/scripts/cloud-scripts/zkcli.sh
    + '[' '!' -x /opt/solr/server/scripts/cloud-scripts/zkcli.sh ']'
    + set -x
    + /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost sandbox.hortonworks.com:2181/ranger_audits -confname ranger_audits -confdir /opt/solr/ranger_audit_server/conf

    [solr@sandbox ~]$ /opt/solr/ranger_audit_server/scripts/start_solr.sh
    Waiting up to 30 seconds to see Solr running on port 6083 [/]
    Started Solr server on port 6083 (pid=27237). Happy searching!

    [solr@sandbox ~]$ bash -x /opt/solr/ranger_audit_server/scripts/create_ranger_audits_collection.sh
    + SOLR_HOST_URL=http://sandbox.hortonworks.com:6083
    + SOLR_ZK=sandbox.hortonworks.com:2181/ranger_audits
    + SOLR_INSTALL_DIR=/opt/solr
    + SHARDS=1
    + REPLICATION=1
    + CONF_NAME=ranger_audits
    + COLLECTION_NAME=ranger_audits
    + which curl
    + '[' 0 -ne 0 ']'
    + set -x
    + curl --negotiate -u : 'http://sandbox.hortonworks.com:6083/solr/admin/collections?action=CREATE&name=ranger_audits&numShards=1&replicationFactor=1&collection.configName=ranger_audits&maxShardsPerNode=100'
    0462303160ranger_audits_shard1_replica1
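    To confirm the collection afterwards, the Solr Collections API can list what exists (a sketch reusing the host and port from the transcript above; LIST is part of the Collections API in Solr 5.x):

      curl --negotiate -u : 'http://sandbox.hortonworks.com:6083/solr/admin/collections?action=LIST&wt=json'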

  14. 20170710.1342
  15. Apache Hadoop 2.7.3 – YARN Application Security - [s]

  16. 20170708.0750
  17. hive|metastore > export|extract table DDL (all tables) - [s]

    mysql -u hive -p -e "select concat('show create table ', T.NAME, '.', T.TBL_NAME, ';') from (select DBS.NAME, TBLS.TBL_NAME from TBLS left join DBS on TBLS.DB_ID = DBS.DB_ID) T" hive > /tmp/file.ddl
    # remove the header line from /tmp/file.ddl
    hive -f /tmp/file.ddl > /tmp/create_table.ddl
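    The "remove the header line" step can be skipped by telling the mysql client not to print column names (a sketch; -N / --skip-column-names is a standard client option):

      mysql -N -u hive -p -e "select concat('show create table ', T.NAME, '.', T.TBL_NAME, ';') from (select DBS.NAME, TBLS.TBL_NAME from TBLS left join DBS on TBLS.DB_ID = DBS.DB_ID) T" hive > /tmp/file.ddl
      hive -f /tmp/file.ddl > /tmp/create_table.ddl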

  18. 20170708.0733
  19. hive > Query|search Metastore database|table with -executeJDOQL for stats|statistics - [s]

    HIVE_CONF_DIR=/etc/hive/conf/conf.server/ hive --service metatool -executeJDOQL 'select dbName+"."+tableName+"::"+colName+"="+numDVs from org.apache.hadoop.hive.metastore.model.MTableColumnStatistics';

    HIVE_CONF_DIR=/etc/hive/conf/conf.server/ hive --service metatool -executeJDOQL 'select dbName+"."+tableName+"("+partitionName+")::"+colName+"="+numDVs from org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics';

    Ref:
    https://db.apache.org/jdo/jdoql.html
    https://github.com/apache/hive/blob/master/metastore/src/model/package.jdo

  20. 20170707.1610
  21. hive|llap > Optimizing an Apache Hive Data Warehouse - [s]

  22. 20170705.1656
  23. hbase > Performance testing a HBase cluster (hbase pe) - [s]

    20170725.0919
  1. linux > vmstat -d - [s]

    FIELD DESCRIPTION FOR DISK MODE
    Reads
        total:   Total reads completed successfully
        merged:  grouped reads (resulting in one I/O)
        sectors: Sectors read successfully
        ms:      milliseconds spent reading
    Writes
        total:   Total writes completed successfully
        merged:  grouped writes (resulting in one I/O)
        sectors: Sectors written successfully
        ms:      milliseconds spent writing
    IO
        cur: I/O in progress
        s:   seconds spent for I/O
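    As a usage reminder, the disk statistics can be sampled repeatedly (a sketch: 5-second interval, 3 samples):

      vmstat -d 5 3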

  2. 20170721.0847
  3. Python with Visual Studio Code - [s]

  4. 20170721.0845
  5. VS Code (Visual Studio) setup for Python debug - [s]

  6. 20170718.1452
  7. Hive > read explain plan (Tez) - [s]

  8. 20170718.1200
  9. mysql > How to shrink/purge ibdata1 file in MySQL (data/database) - [s]

  10. 20170718.0907
  11. java9 > [JDK-8176361] jdk.net.hosts.file - [s]

  12. 20170718.0851
  13. java > InetAddress Caching - [s]

    The InetAddress class has a cache to store successful as well as unsuccessful host name resolutions.

    By default, when a security manager is installed, in order to protect against DNS spoofing attacks, the results of positive host name resolutions are cached forever. When a security manager is not installed, the default behavior is to cache entries for a finite (implementation dependent) period of time. The result of unsuccessful host name resolution is cached for a very short period of time (10 seconds) to improve performance.

    If the default behavior is not desired, then a Java security property can be set to a different Time-to-live (TTL) value for positive caching. Likewise, a system admin can configure a different negative caching TTL value when needed.

    Two Java security properties control the TTL values used for positive and negative host name resolution caching:

    networkaddress.cache.ttl
        Indicates the caching policy for successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the successful lookup. The default setting is to cache for an implementation specific period of time. A value of -1 indicates "cache forever".

    networkaddress.cache.negative.ttl (default: 10)
        Indicates the caching policy for un-successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the failure for un-successful lookups. A value of 0 indicates "never cache". A value of -1 indicates "cache forever".
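    These are security properties, so they can be changed in $JAVA_HOME/jre/lib/security/java.security (a sketch; 60 seconds is only an example value):

      # $JAVA_HOME/jre/lib/security/java.security
      networkaddress.cache.ttl=60
      networkaddress.cache.negative.ttl=10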

  14. 20170718.0703
  15. Is it possible to filter request method in chrome developer tools? - [s]

    -method:GET -persist

  16. 20170717.0804
  17. Kerberos > recreate|re-create KDC REALM - [s]

  18. 20170712.1501
  19. linux|bash > how to find the gateway used for routing (route) - [s]

    [root@node4 ~]# ip route get 10.1.0.6
    10.1.0.6 via 172.17.0.1 dev eth0  src 172.17.100.4
        cache
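    To see the default gateway itself rather than the route chosen for one destination, the same tool can show it directly (sketch):

      ip route show default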

  20. 20170712.1433
  21. windows server > Install Telnet Client - [s]

    pkgmgr /iu:"TelnetClient"
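    On newer Windows Server releases the DISM equivalent is usually used instead (a sketch; run from an elevated prompt):

      dism /online /Enable-Feature /FeatureName:TelnetClient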

  22. 20170712.1138
  23. java > SYS TIME GREATER THAN USER TIME – GC Easy - [s]

  24. 20170710.2239
  25. Migrating Audit Logs from DB to Solr in Ambari Clusters - Hortonworks Data Platform - [s]

    With HDP 2.5.5.0, ranger.jpa.audit.jdbc.xxx does not work:
    #ranger.jpa.audit.jdbc.driver=org.postgresql.Driver
    #ranger.jpa.audit.jdbc.dialect=org.eclipse.persistence.platform.database.PostgreSQLPlatform
    ranger.jpa.audit.jdbc.url=jdbc:postgresql://node1.localdomain:5432/ranger_audit
    #ranger.jpa.audit.jdbc.user=rangeradmin
    #ranger.jpa.audit.jdbc.password=hadoop

    kinit is not required:
    #kinit -kt /etc/security/keytabs/rangeradmin.service.keytab rangeradmin/`hostname -f`

    cd /usr/hdp/current/ranger-admin
    # use solrcloud so that ambari creates ranger_solr_jaas.conf, or
    java -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug=true -Djava.security.debug=gssloginconfig,configfile,configparser,logincontext \
      -Djava.security.auth.login.config=./conf/ranger_solr_jaas.conf \
      -Dlogdir=ews/logs -Dlog4j.configuration=db_patch.log4j.xml \
      -cp ews/webapp/WEB-INF/classes/conf:ews/webapp/WEB-INF/classes/lib/*:ews/webapp/WEB-INF/:ews/webapp/META-INF/:ews/webapp/WEB-INF/lib/*:ews/webapp/WEB-INF/classes/:ews/webapp/WEB-INF/classes/META-INF:/usr/share/java/postgresql-9.3-1101-jdbc4.jar:/usr/share/java/mysql-connector-java.jar \
      org.apache.ranger.patch.cliutil.DbToSolrMigrationUtil

    cat ./conf/ranger_solr_jaas.conf
    # somehow useTicketCache=true doesn't work
    Client {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      storeKey=true
      useTicketCache=false
      keyTab="/etc/security/keytabs/rangeradmin.service.keytab"
      principal="rangeradmin/node3.localdomain@EXAMPLE.COM";
    };

    diff -u ./ews/webapp/WEB-INF/db_patch.log4j.xml ./ews/webapp/WEB-INF/db_patch.log4j.xml.orig
    --- ./ews/webapp/WEB-INF/db_patch.log4j.xml      2017-07-10 10:51:40.978774284 +0000
    +++ ./ews/webapp/WEB-INF/db_patch.log4j.xml.orig 2017-04-21 03:45:02.000000000 +0000
    @@ -70,17 +70,17 @@
         </category>
         <category name="org.apache.ranger" additivity="false">
    -        <priority value="debug" />
    +        <priority value="info" />
             <appender-ref ref="xa_log_appender" />
         </category>
         <category name="xa" additivity="false">
    -        <priority value="debug" />
    +        <priority value="info" />
             <appender-ref ref="xa_log_appender" />
         </category>
         <root>
    -        <priority value="debug" />
    +        <priority value="warn" />
             <appender-ref ref="xa_log_appender" />
         </root>

  26. 20170627.0711
  27. Java > serialize/serialization object vulnerability in commons collection library - [s]

    "What Do WebLogic, WebSphere, JBoss, Jenkins, OpenNMS, and Your Application Have in Common? This Vulnerability."

    There are two separate Java objects in the screenshot in that post. One is base64 encoded and can be seen in the rightmost column beginning with “rO0AB”. The other is raw binary going over the wire, so we’ll have to look at the hex column in the middle. It begins with the bytes “ac ed 00 05 73 72”.
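    Those two fingerprints are the same thing: the base64 prefix “rO0AB” decodes to the Java serialization stream magic bytes. A quick check (sketch; "==" padding added so base64 can decode the prefix):

      echo 'rO0ABQ==' | base64 -d | xxd    # prints: 00000000: aced 0005 ...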

[s] hadoop (12)
[s] public (14)