IllustriousFrosting2
Posts: 4
Joined: Sat May 02, 2020 12:27 pm

Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 2:59 pm

Hello!

I am currently using Python to transfer a fairly large (~400MB-1GB) file directory over a network. Using just one port right now, I predict it would take 45 minutes to complete the entire transfer. Would it be unreasonable to open a hundred or so ports to get the job done in about half a minute?

Thank you!!

jahboater
Posts: 5825
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 3:17 pm

IllustriousFrosting2 wrote:
Tue May 05, 2020 2:59 pm
Hello!

I am currently using Python to transfer a fairly large (~400MB-1GB) file directory over a network. Using just one port right now, I predict it would take 45 minutes to complete the entire transfer. Would it be unreasonable to open a hundred or so ports to get the job done in about half a minute?

Thank you!!
Gut feeling says you would get no overall improvement.
If you include the time spent coding and debugging it then .....

Presumably you are using TCP, which is very efficient.

The Linux disk cache will hide most of the disk I/O delays, especially if you have a 4GB Pi (you are using a Pi4 with GigE, aren't you?)
Read the files with a decent block size (a power of two larger than 4KB - the "cp" command uses 128KB).
Consider posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL) to double the prefetch.
Disk writes should return almost instantly (the write cache absorbs them).
You might get a small improvement by using something faster than Python (C is more common for this sort of program).

Just guesses, never tried it!!!
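The read-side advice above might look something like this in Python (a sketch only - the `send_file` helper and the socket handling are illustrative, not the OP's actual code):

```python
import os

CHUNK = 128 * 1024  # 128 KB blocks, the same size the "cp" command uses

def send_file(path, sock):
    """Read one file sequentially in large blocks and push it down a socket."""
    with open(path, "rb") as f:
        # Hint the kernel that we'll read sequentially, so it prefetches more.
        if hasattr(os, "posix_fadvise"):  # Linux only
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            sock.sendall(block)
```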
Last edited by jahboater on Tue May 05, 2020 3:21 pm, edited 1 time in total.
Pi4 8GB running PIOS64

IllustriousFrosting2
Posts: 4
Joined: Sat May 02, 2020 12:27 pm

Re: Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 3:20 pm

Thank you for the response!

- I'm currently using a Raspberry Pi 3 Model B.
- My code works by taking buffer "slices" of incoming data, gradually building up each file's data, and having the program sleep for a few moments between iterations. Otherwise, the program glitches for whatever reason and doesn't read the data correctly. I think that using multiple ports at once will allow me to read more than one file simultaneously.
- Is there something obvious I'm doing wrong? I can share my code later.

Thank you again, this helps a ton!

jahboater
Posts: 5825
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 3:34 pm

IllustriousFrosting2 wrote:
Tue May 05, 2020 3:20 pm
- I'm currently using a Raspberry Pi 3 Model B.
Seriously, consider getting a Pi4 !!!
It has true GigE Ethernet connected directly to the SoC (not 100 Mbps shared with the USB2 controller).
The SD card read speed is doubled.
The memory is several times faster and available up to 4GB.
The CPU cores are more modern Cortex-A72s with proper out-of-order execution.
There are two USB3 ports.
IllustriousFrosting2 wrote:
Tue May 05, 2020 3:20 pm
Otherwise, the program glitches for whatever reason and doesn't read the data correctly.
That doesn't sound good.
It might be best to fix that first rather than add further complication.
You are using TCP, I hope, not UDP? TCP should be best for streaming data.
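With TCP, the likely cause of the "glitch" is that recv() returns however many bytes happen to be available, not a full buffer - a sleep only "fixes" it by letting more data queue up. A sketch of the usual fix (the 8-byte length prefix is an assumption here; the OP's framing isn't shown):

```python
import struct

BLOCK = 128 * 1024  # ask for generous chunks each time

def recv_exact(sock, n):
    """Loop until exactly n bytes have arrived.

    A single recv() may legally return fewer bytes than requested -
    that is normal TCP behaviour, not a glitch.
    """
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(min(remaining, BLOCK))
        if not chunk:
            raise ConnectionError("socket closed mid-transfer")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_file(sock):
    """Receive one file sent with an 8-byte big-endian length prefix."""
    (size,) = struct.unpack("!Q", recv_exact(sock, 8))
    return recv_exact(sock, size)
```

With framing like this there is no need to sleep between reads at all.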
IllustriousFrosting2 wrote:
Tue May 05, 2020 3:20 pm
I think that using multiple ports at once will allow me to read more than one file simultaneously.
Down the one single piece of wire?
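A back-of-envelope calculation makes the point - extra connections all share that one wire, and the wire itself is not the bottleneck (the 94% efficiency figure below is a rough assumption):

```python
# What one saturated 100 Mbit/s link (the Pi 3's Ethernet) can move.
link_bits_per_s = 100e6 * 0.94   # assume ~94% usable after TCP/IP overhead
file_bytes = 1e9                 # the 1 GB worst case from the OP

seconds = file_bytes * 8 / link_bits_per_s
print(round(seconds))            # ~85 s - so a 45-minute transfer is limited
                                 # by the program, not by the number of ports
```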
Last edited by jahboater on Tue May 05, 2020 3:36 pm, edited 2 times in total.
Pi4 8GB running PIOS64

ejolson
Posts: 5477
Joined: Tue Mar 18, 2014 11:47 am

Re: Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 3:35 pm

IllustriousFrosting2 wrote:
Tue May 05, 2020 3:20 pm
Thank you for the response!

- I'm currently using a Raspberry Pi 3 Model B.
- My code works by taking buffer "slices" of incoming data, gradually building up each file's data, and having the program sleep for a few moments between iterations. Otherwise, the program glitches for whatever reason and doesn't read the data correctly. I think that using multiple ports at once will allow me to read more than one file simultaneously.
- Is there something obvious I'm doing wrong? I can share my code later.

Thank you again, this helps a ton!
There is a program, rsync, designed for transferring directories full of files from one computer to another. It has been optimised rather carefully over the years to run fast. If you want to write your own program in Python to do the same, that is understandable. In that case, please post the source code under the Python topic and include a link to this thread for reference.
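For reference, a typical invocation (the directory and host below are made-up placeholders) can be driven from Python with subprocess:

```python
import subprocess

# Hypothetical source directory and destination - substitute your own.
# The trailing slash on the source means "copy the directory's contents".
cmd = [
    "rsync",
    "-a",                 # archive mode: recurse, keep times and permissions
    "--info=progress2",   # show overall transfer progress (rsync >= 3.1)
    "my_dir/",
    "pi@192.168.1.50:/home/pi/backup/",
]
print(" ".join(cmd))
# subprocess.run(cmd, check=True)   # uncomment to actually run the transfer
```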

jahboater
Posts: 5825
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 3:37 pm

ejolson wrote:
Tue May 05, 2020 3:35 pm
There is a program rsync designed for transferring directories full of files from one computer to another. It has been optimised rather carefully over the years to run fast.
I agree.

Other programs like scp will have an encryption overhead.
Pi4 8GB running PIOS64

ejolson
Posts: 5477
Joined: Tue Mar 18, 2014 11:47 am

Re: Use large number of ports in Python to speed file transfer?

Tue May 05, 2020 3:57 pm

jahboater wrote:
Tue May 05, 2020 3:37 pm
ejolson wrote:
Tue May 05, 2020 3:35 pm
There is a program rsync designed for transferring directories full of files from one computer to another. It has been optimised rather carefully over the years to run fast.
I agree.

Other programs like scp will have an encryption overhead.
The common use of rsync is through an ssh-encrypted tunnel, and on modern hardware (fast processors relative to network speed) the encryption runs at wire speed.

You are right, however, that it's possible to use rsync unencrypted. Compared to a Python program with built-in delays to pace things, even scp should be fast.
