Ambari Pitfall Notes

Problem 1
This problem came up while registering hosts during installation.

('ERROR 2015-02-06 20:09:43,441 NetUtil.py:56 - [Errno 1] _ssl.c:492: error:100AE081:elliptic curve routines:EC_GROUP_new_by_curve_name:unknown group
ERROR 2015-02-06 20:09:43,442 NetUtil.py:58 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
INFO 2015-02-06 20:09:43,442 NetUtil.py:81 - Server at https://Master.domain.dev:8440 is not reachable, sleeping for 10 seconds...
INFO 2015-02-06 20:09:45,343 main.py:83 - loglevel=logging.INFO
INFO 2015-02-06 20:09:45,343 main.py:55 - signal received, exiting.
INFO 2015-02-06 20:09:45,343 ProcessHelper.py:39 - Removing pid file
INFO 2015-02-06 20:09:45,344 ProcessHelper.py:46 - Removing temp files
INFO 2015-02-06 20:10:19,815 main.py:83 - loglevel=logging.INFO
INFO 2015-02-06 20:10:19,816 DataCleaner.py:36 - Data cleanup thread started
INFO 2015-02-06 20:10:19,816 DataCleaner.py:71 - Data cleanup started
INFO 2015-02-06 20:10:19,816 DataCleaner.py:73 - Data cleanup finished
INFO 2015-02-06 20:10:19,973 PingPortListener.py:51 - Ping port listener started on port: 8670
INFO 2015-02-06 20:10:19,973 main.py:227 - Connecting to the server at: https://Master.domain.dev:8440
INFO 2015-02-06 20:10:19,974 NetUtil.py:72 - DEBUG: Trying to connect to the server at https://Master.domain.dev:8440
INFO 2015-02-06 20:10:19,974 NetUtil.py:42 - Connecting to the following url https://Master.domain.dev:8440/cert/ca
ERROR 2015-02-06 20:10:20,023 NetUtil.py:56 - [Errno 1] _ssl.c:492: error:100AE081:elliptic curve routines:EC_GROUP_new_by_curve_name:unknown group
ERROR 2015-02-06 20:10:20,023 NetUtil.py:58 - SSLError: Failed to connect. Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
INFO 2015-02-06 20:10:20,023 NetUtil.py:81 - Server at https://Master.domain.dev:8440 is not reachable, sleeping for 10 seconds...
', None)

Connection to Slave5.domain.dev closed.
SSH command execution finished
host=Slave5.domain.dev, exitcode=0
Command end time 2015-02-06 12:07:47

Registering with the server...
Registration with the server failed.

Solution:
The following fix was posted on the official forum and works well.

Platforms:

RHEL / CentOS 6.5
Ambari 1.4 or later
Root Cause: The OpenSSL library available and installed by default on RHEL/CentOS 6.5 has a bug. Refer to https://bugzilla.redhat.com/show_bug.cgi?id=1025598 for detailed information on the bug.

Remedy:

Check the OpenSSL library version installed on your host(s):
rpm -qa | grep openssl

openssl-1.0.1e-15.el6.x86_64

If the output says openssl-1.0.1e-15.x86_64 (1.0.1 build 15) you will need to upgrade the OpenSSL library by running the following command:
yum upgrade openssl

Verify you have the newer version of OpenSSL (1.0.1 build 16):
rpm -qa | grep openssl

openssl-1.0.1e-16.el6.x86_64

Restart the Ambari Agent(s) and click Retry Failed in the wizard.
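
The version check in the remedy above can also be scripted. The following is a minimal sketch and not part of the forum answer: the function name needs_upgrade and the parameter min_release are hypothetical, and the code only interprets package names of the form shown in the rpm output (openssl-1.0.1e-<build>...).

```python
import re


def needs_upgrade(package_name, min_release=16):
    """Return True if an openssl-1.0.1e RPM name has a build below min_release.

    Hypothetical helper: it only understands names such as
    'openssl-1.0.1e-15.el6.x86_64' and makes no claim about other packages.
    """
    match = re.match(r"^openssl-1\.0\.1e-(\d+)", package_name)
    if match is None:
        # Not an openssl-1.0.1e package name; this sketch does not judge it.
        return False
    return int(match.group(1)) < min_release


if __name__ == "__main__":
    # The two package names quoted in the remedy above.
    for name in ("openssl-1.0.1e-15.el6.x86_64", "openssl-1.0.1e-16.el6.x86_64"):
        print(name, "-> upgrade needed:", needs_upgrade(name))
```

Each line of `rpm -qa | grep openssl` output could be passed to this function on every host before retrying the registration.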

Problem 2

org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=Administrator, access=WRITE, inode="hadoop": hadoop:supergroup:rwxr-xr-x

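The cause of this error is easy to read from the message: the user Administrator attempted a write on the inode "hadoop" and was rejected by the HDFS permission system, because that inode is owned by hadoop:supergroup with mode rwxr-xr-x.
To work around it, set dfs.permissions to false (the property is named dfs.permissions.enabled in Hadoop 2.x), which turns off HDFS permission checking.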
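
To make the reasoning behind the message concrete, here is a minimal sketch that parses the exception text and shows which permission bits applied. The function explain_denial and its user_groups parameter are hypothetical, the check is a simplified owner/group/other evaluation (it ignores the HDFS superuser), and it assumes Administrator is not a member of supergroup.

```python
import re

# The message text quoted in the exception above.
MESSAGE = ('Permission denied: user=Administrator, access=WRITE, '
           'inode="hadoop": hadoop:supergroup:rwxr-xr-x')

PATTERN = re.compile(
    r'user=(\w+), access=(\w+), inode="([^"]*)":\s*(\w+):(\w+):([rwx-]{9})'
)


def explain_denial(message, user_groups=()):
    """Return (permission class, applicable bits, allowed?) for the request.

    Simplified model: owner bits if the user owns the inode, group bits if
    the user is in the inode's group, otherwise the "other" bits.
    """
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("unrecognized message format")
    user, access, _inode, owner, group, mode = match.groups()
    if user == owner:
        perm_class, bits = "owner", mode[0:3]
    elif group in user_groups:
        perm_class, bits = "group", mode[3:6]
    else:
        perm_class, bits = "other", mode[6:9]
    needed = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}[access]
    return perm_class, bits, needed in bits


if __name__ == "__main__":
    print(explain_denial(MESSAGE))
```

Administrator is neither the owner nor (by assumption) in supergroup, so the "other" bits r-x apply, and they do not include write.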

Problem 3
Starting the NameNode fails with the following error:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 134, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 233, in restart
    self.start(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/namenode.py", line 46, in start
    namenode(action="start")
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_namenode.py", line 60, in namenode
    only_if=dfs_check_nn_status_cmd #skip when HA not active
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 149, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 115, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 237, in action_run
    path=self.resource.path)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 36, in checked_call
    return _call(command, logoutput, True, cwd, env, preexec_fn, user, wait_for_finish, timeout, path)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 101, in _call
    err_msg = Logger.get_protected_text(("Execution of '%s' returned %d. %s") % (command, code, out))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 116: ordinal not in range(128)

In the Lib directory under your Python installation directory, find site.py and edit the setencoding() function:

def setencoding():
    ...
    if 0:
        # Enable to support locale aware default string encodings.

Change that if 0 to if 1, so that the locale-aware default string encoding is enabled.
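
The error class itself is easy to reproduce outside Ambari. The sketch below is only an illustration: the actual command output that triggered the traceback is not shown in this post, so the sample character (U+672A) is an assumption, chosen because its UTF-8 encoding starts with the same byte 0xe6. The helper try_decode is hypothetical.

```python
# -*- coding: utf-8 -*-

# An assumed sample: one CJK character encoded as UTF-8 (bytes e6 9c aa).
raw = u"\u672a".encode("utf-8")


def try_decode(data, encoding):
    """Decode bytes with the given codec, reporting the failure reason if any."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: %s" % exc.reason


if __name__ == "__main__":
    # The first byte is 0xe6, the byte named in the traceback.
    print(hex(bytearray(raw)[0]))
    # The ascii codec rejects any byte above 0x7f.
    print(try_decode(raw, "ascii"))
    # The same bytes decode cleanly as UTF-8.
    print(try_decode(raw, "utf-8") == u"\u672a")
```

In Python 2 the same failure happens implicitly when byte strings containing non-ASCII output are mixed with unicode strings, which is why changing the default encoding makes the traceback go away.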


https://www.920929.xyz/posts/9a1ba29f.html
Author: DELIN
Published: October 14, 2016