module remote.ssh_remote_connection
Short summary
module pyenbc.remote.ssh_remote_connection
A class to help connect to a remote machine and send command lines.
Classes
class | truncated documentation |
---|---|
ASSHClient | A simple class to access a remote machine through SSH. It requires modules paramiko, … |
Static Methods
staticmethod | truncated documentation |
---|---|
_get_out_format | Returns a function which converts an ANSI string into a different format. |
build_command_line_parameters | builds a string for pig based on the parameters in params |
parse_lsout | parses the output of a command ls |
Methods
method | truncated documentation |
---|---|
__init__ | constructor |
__str__ | usual |
close | close the connection |
close_session | close a session |
connect | connect |
dfs_exists | tells if a file exists on the cluster |
dfs_ls | return the content of a folder on the cluster as a DataFrame |
dfs_mkdir | creates a directory on the cluster |
dfs_rm | removes a file on the cluster |
download | download a file from the remote machine (not on the cluster) |
download_cluster | download a file directly from the cluster to the local machine |
execute_command | execute a command line, it raises an error if there is an error |
exists | tells if a file exists on the bridge |
hive_submit | submits a HIVE script or query, a script is first uploaded to the default folder and then submitted |
ls | return the content of a folder on the bridge as a DataFrame |
open_session | Opens a session with method invoke_shell. … |
pig_submit | Submits a PIG script, it first uploads the script to the default folder and submits it. |
send_recv_session | Send something through a session, the function is supposed to return when the execution of the given command is done, … |
upload | upload a file to the remote machine (not on the cluster) |
upload_cluster | the function directly uploads the file to the cluster, it first goes to the bridge, uploads it to the cluster and … |
Documentation
A class to help connect to a remote machine and send command lines.
- class pyenbc.remote.ssh_remote_connection.ASSHClient(server, username, password)
Bases: object
A simple class to access a remote machine through SSH. It requires the modules paramiko, pycrypto and ecdsa.
This class is used in the magic command remote_open. On Windows, the installation of pycrypto can be tricky. See Pycrypto on Windows. Those modules are part of the Anaconda distribution.
constructor
- Parameters:
server – server
username – username
password – password
- __init__(server, username, password)
constructor
- Parameters:
server – server
username – username
password – password
- __str__()
usual
- _allowed_form = {None: None, 'plain': None, 'html': None}
- static _get_out_format(format)
Returns a function which converts an ANSI string into a different format.
- Parameters:
format – string
- Returns:
function
- static build_command_line_parameters(params, command_name='-param')
builds a string for pig based on the parameters in params
- Parameters:
params – dictionary
command_name – -param or -hiveconf
- Returns:
string
New in version 1.1.
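The string built by this method can be pictured with a short sketch. It assumes the output has the form `-param key=value` repeated for each entry, which is how pig and hive accept parameters on the command line. The function name build_params is hypothetical, and the real method may quote or order the values differently.

```python
def build_params(params, command_name="-param"):
    """Return a string such as '-param year=2014 -param city=paris'.

    Hypothetical re-implementation, for illustration only.
    """
    if not params:
        return ""
    # One "<command_name> key=value" group per dictionary entry.
    return " ".join(f"{command_name} {key}={value}"
                    for key, value in params.items())


print(build_params({"year": 2014, "city": "paris"}))
print(build_params({"year": 2014}, command_name="-hiveconf"))
```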
- close()
close the connection
- close_session()
close a session
- connect()
- dfs_exists(path)
tells if a file exists on the cluster
- Parameters:
path – path
- Returns:
boolean
New in version 1.1.
- dfs_ls(path)
return the content of a folder on the cluster as a DataFrame
- Parameters:
path – path on the cluster
- Returns:
DataFrame
New in version 1.1.
- dfs_mkdir(path)
creates a directory on the cluster
- Parameters:
path – path
New in version 1.1.
- dfs_rm(path, recursive=False)
removes a file on the cluster
- Parameters:
path – path
recursive – boolean
New in version 1.1.
- download(remotepath, localpath)
download a file from the remote machine (not on the cluster)
- Parameters:
localpath – local file
remotepath – remote file (it can be a list, localpath is a folder in that case)
Changed in version 1.1: remotepath can be a list of paths
- download_cluster(remotepath, localpath, merge=False)
download a file directly from the cluster to the local machine
- Parameters:
localpath – local file
remotepath – remote file (it can be a list, localpath is a folder in that case)
merge – True to use getmerge instead of get
New in version 1.1.
- execute_command(command, no_exception=False, fill_stdin=None)
Executes a command line and raises an exception if the command reports an error.
- Parameters:
command – command
no_exception – if True, do not raise any exception
fill_stdin – data to send on the stdin input
- Returns:
stdout, stderr
Example of commands:
    ssh.execute_command("ls")
    ssh.execute_command("hdfs dfs -ls")
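The contract described above (return stdout and stderr, raise unless no_exception is set) can be tried locally without a server. The sketch below uses subprocess as a stand-in for the SSH channel, so it is not the class's implementation. It also assumes that "an error" means a non-empty standard error, which the documentation does not confirm. The name run_command is hypothetical.

```python
import subprocess
import sys


def run_command(command, no_exception=False, fill_stdin=None):
    """Local stand-in for execute_command: returns (stdout, stderr)."""
    proc = subprocess.run(command, input=fill_stdin,
                          capture_output=True, text=True)
    # Assumption: any output on stderr counts as an error.
    if proc.stderr and not no_exception:
        raise RuntimeError(f"command failed: {proc.stderr.strip()}")
    return proc.stdout, proc.stderr


# The current Python interpreter is used so the example runs anywhere.
out, err = run_command(
    [sys.executable, "-c", "print(input().upper())"],
    fill_stdin="hello\n")
print(out)
```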
- exists(path)
tells if a file exists on the bridge
- Parameters:
path – path
- Returns:
boolean
New in version 1.1.
- hive_submit(hive_file_or_query, params=None, redirection='redirection.hive', no_exception=True, fLOG=<function noLOG>)
Submits a HIVE script or query. A script is first uploaded to the default folder and then submitted.
- Parameters:
hive_file_or_query – HIVE script (local file) or query
params – parameters to send to the job
redirection – string empty or not
no_exception – sent to execute_command
fLOG – logging function
- Returns:
out, err from execute_command
If redirection is not empty, the job is submitted and the function returns without waiting for it to finish, once the standard output and error have been redirected to redirection.hive.out and redirection.hive.err.
The function executes the command line:
hive -f <filename>
Or:
hive -e <query>
With redirection:
hive -execute -f <filename> 2> redirection.hive.err 1> redirection.hive.out &
If there is no redirection, the function waits and returns the output.
Submit a HIVE query
    client = ASSHClient("<server>", "<login>", "<password>")
    client.connect()
    hive_sql = '''
        DROP TABLE IF EXISTS bikes20;
        CREATE TABLE bikes20 (sjson STRING);
        LOAD DATA INPATH "/user/__USERNAME__/unittest2/paris*.txt" INTO TABLE bikes20;
        SELECT * FROM bikes20 LIMIT 10;
        '''.replace("__USERNAME__", client.username)
    out, err = client.hive_submit(hive_sql, redirection=None)
New in version 1.1.
- ls(path)
return the content of a folder on the bridge as a DataFrame
- Parameters:
path – path on the bridge
- Returns:
DataFrame
New in version 1.1.
- open_session(no_exception=False, timeout=1.0, add_eol=True, prompts=('~$', '>>>'), out_format=None)
Opens a session with method invoke_shell.
- Parameters:
no_exception – if True, do not raise any exception in case of error
timeout – timeout in s
add_eol – if True, the function adds an EOL to the sent command if it does not have one
prompts – the function returns as soon as the output ends with one of these strings
out_format – None, plain, html
How to open a remote shell?
    ssh = ASSHClient("<server>", "<login>", "<password>")
    ssh.connect()
    ssh.open_session()
    out = ssh.send_recv_session("ls")
    print(ssh.send_recv_session("python"))
    print(ssh.send_recv_session("print('3')"))
    print(ssh.send_recv_session("import sys\nsys.executable"))
    print(ssh.send_recv_session("sys.exit()"))
    print(ssh.send_recv_session(None))
    ssh.close_session()
    ssh.close()
The notebook Communication with a remote Linux machine through SSH illustrates the output of these instructions.
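The roles of timeout and prompts can be sketched as a small reading loop. This is only an illustration of the idea, not the class's code. The callable recv is hypothetical and stands for whatever returns the next chunk of output, and treating timeout as a limit on the whole read is an assumption.

```python
import time


def read_until_prompt(recv, prompts=("~$", ">>>"), timeout=1.0):
    """Collect output until it ends with a prompt or the timeout expires.

    `recv` is a hypothetical callable returning the next chunk of text,
    or an empty string when nothing is available yet.
    """
    deadline = time.monotonic() + timeout
    output = ""
    while time.monotonic() < deadline:
        chunk = recv()
        if chunk:
            output += chunk
            # Stop as soon as the output ends with one of the prompts.
            if output.rstrip().endswith(tuple(prompts)):
                break
        else:
            time.sleep(0.01)  # avoid a busy loop while waiting
    return output


chunks = iter(["file1  file2\n", "user@host:~$ "])
print(read_until_prompt(lambda: next(chunks, "")))
```

Without the timeout, a command that never prints a known prompt would block the caller forever, which is the situation send_recv_session has to avoid.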
- static parse_lsout(out, local_schema=True)
parses the output of a command ls
- Parameters:
out – output
local_schema – schema for the bridge or the cluster (False)
- Returns:
DataFrame
New in version 1.1.
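As an illustration of what parsing the output of ls involves, here is a minimal sketch for the output of `ls -l` on the bridge. It returns a list of dictionaries rather than a DataFrame so that it has no dependency. The function name parse_ls and the column names are hypothetical, and the cluster format (local_schema=False) is not covered.

```python
def parse_ls(out):
    """Parse the text printed by `ls -l` into a list of dictionaries."""
    rows = []
    for line in out.splitlines():
        # permissions, links, owner, group, size, month, day, time, name
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # skips the "total N" header and blank lines
        rows.append({
            "attributes": fields[0],
            "owner": fields[2],
            "group": fields[3],
            "size": int(fields[4]),
            "date": " ".join(fields[5:8]),
            "name": fields[8],
            "isdir": fields[0].startswith("d"),
        })
    return rows


sample = """total 8
-rw-r--r-- 1 user group  1234 Jan  5 10:32 notes.txt
drwxr-xr-x 2 user group  4096 Dec 12  2014 data
"""
for row in parse_ls(sample):
    print(row["name"], row["size"], row["isdir"])
```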
- pig_submit(pig_file, dependencies=None, params=None, redirection='redirection.pig', local=False, stop_on_failure=False, check=False, no_exception=True, fLOG=<function noLOG>)
Submits a PIG script: it first uploads the script to the default folder and then submits it.
- Parameters:
pig_file – pig script (local)
dependencies – others files to upload (still in the default folder)
params – parameters to send to the job
redirection – string empty or not
local – local run or not (option -x local) (in that case, redirection will be empty)
stop_on_failure – if True, add option -stop_on_failure on the command line
check – if True, add option -check (in that case, redirection will be empty)
no_exception – sent to execute_command
fLOG – logging function
- Returns:
out, err from execute_command
If redirection is not empty, the job is submitted and the function returns without waiting for it to finish, once the standard output and error have been redirected to redirection.pig.out and redirection.pig.err (with the default value of redirection).
The first file will contain the results of commands DESCRIBE, DUMP, EXPLAIN. The standard error receives logs and exceptions.
The function executes the command line:
pig -execute -f <filename>
With redirection:
pig -execute -f <filename> 2> redirection.pig.err 1> redirection.pig.out &
New in version 1.1.
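The way the options combine into a command line can be sketched as follows. The helper pig_command is hypothetical. It only reuses the flags and the two command forms quoted above, and the position of the options before -execute is an assumption.

```python
def pig_command(filename, params="", redirection="redirection.pig",
                local=False, stop_on_failure=False, check=False):
    """Compose a pig command line from the documented options."""
    options = []
    if local:
        options.append("-x local")
    if stop_on_failure:
        options.append("-stop_on_failure")
    if check:
        options.append("-check")
    if params:
        options.append(params)  # e.g. "-param year=2014"
    if local or check:
        redirection = None      # documented: redirection is emptied
    cmd = " ".join(["pig"] + options + ["-execute", "-f", filename])
    if redirection:
        # Background job: stderr and stdout go to two files.
        cmd += f" 2> {redirection}.err 1> {redirection}.out &"
    return cmd


print(pig_command("job.pig"))
print(pig_command("job.pig", check=True))
```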
- send_recv_session(fillin)
Sends something through a session. The function is supposed to return when the execution of the given command is done, but this is difficult to detect without knowing exactly what was sent.
A timeout therefore tells the function to return even if nothing indicates that the command has finished. If fillin is None, the function just listens to the output.
- Parameters:
fillin – sent to stdin
- Returns:
stdout
The output contains escape codes. They can be converted to plain text or HTML by using the modules ansiconv and ansi2html. The format can be specified when opening the session (parameter out_format of open_session).
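To show what these escape codes look like, the sketch below strips them with a regular expression. It is a dependency-free substitute for the plain-text conversion, not what ansiconv or ansi2html do internally, and it only handles the common CSI sequences such as colours. The name to_plain is hypothetical.

```python
import re

# CSI sequence: ESC [ then parameter bytes, intermediate bytes, final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def to_plain(text):
    """Remove ANSI CSI escape sequences from a string."""
    return ANSI_CSI.sub("", text)


# A directory name coloured in bold blue, as ls --color would print it.
colored = "\x1b[01;34mdata\x1b[0m  notes.txt"
print(to_plain(colored))
```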
- upload(localpath, remotepath)
upload a file to the remote machine (not on the cluster)
- Parameters:
localpath – local file (or a list of files)
remotepath – remote file
Changed in version 1.1: it can upload multiple files if localpath is a list
- upload_cluster(localpath, remotepath)
Uploads a file directly to the cluster: the file first goes to the bridge, is then uploaded to the cluster, and is finally deleted from the bridge.
- Parameters:
localpath – local filename (or list of files)
remotepath – path to the cluster
- Returns:
filename
New in version 1.1.
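The three steps described for upload_cluster can be sketched with the methods documented on this page. This is an assumption about how they chain together, not the actual implementation: the use of `hdfs dfs -put` and `rm`, the helper name upload_via_bridge and the returned value are all hypothetical. A recording stand-in replaces ASSHClient so the sketch runs without a server.

```python
import posixpath


def upload_via_bridge(client, localpath, remotepath):
    """Upload to the bridge, copy to the cluster, then clean the bridge."""
    name = posixpath.basename(localpath.replace("\\", "/"))
    client.upload(localpath, name)                                # local -> bridge
    client.execute_command(f"hdfs dfs -put {name} {remotepath}")  # bridge -> cluster
    client.execute_command(f"rm {name}")                          # clean the bridge
    return name


class RecordingClient:
    """Stand-in for ASSHClient which only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def upload(self, localpath, remotepath):
        self.calls.append(("upload", localpath, remotepath))

    def execute_command(self, command):
        self.calls.append(("execute_command", command))
        return "", ""


client = RecordingClient()
upload_via_bridge(client, "data/paris.txt", "/user/me/paris.txt")
for call in client.calls:
    print(call)
```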