💾 Archived View for perso.pw › blog › articles › unix-split.gmi captured on 2023-07-10 at 13:52:53. Gemini links have been rewritten to link to archived content
Today I will present the userland program "split" that is used to split a single file into smaller files.
Split creates several smaller files from a single file. The original file can be recovered by running the command cat on all the small files (in the correct order).
There are several use cases for this:
- store a single file (like a backup) on multiple media (floppies, 700 MB CDs, DVDs, etc.)
- parallelize the processing of a file, for example: split a huge log file into small parts and run the analysis on each part
- distribute a file across a few people (I have no idea about the use but I like the idea)
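To illustrate the log analysis case from the list above, here is a minimal sketch. The file name app.log, the "part." prefix and the ERROR marker are made up for the example, so adapt them to your own data.

```shell
# Work in a scratch directory so the parts do not mix with other files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical log: 5000 lines, every 10th one is an error.
awk 'BEGIN {
    for (i = 1; i <= 5000; i++) {
        level = (i % 10 == 0) ? "ERROR" : "INFO"
        print level, "line", i
    }
}' > app.log

# Cut the log into 1000-line parts named part.aa, part.ab, ...
split -l 1000 app.log part.

# Start one background job per part, each writing its own result file.
for f in part.??; do
    grep -c '^ERROR' "$f" > "$f.count" &
done

# Wait for all background jobs before reading the results.
wait

# Add up the per-part counts.
total=$(awk '{ s += $1 } END { print s }' part.??.count)
echo "errors: $total"
```

Each part is handled by its own process, so the work spreads over several CPU cores. For a real workload you would replace grep with your own analysis command.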
Its usage is very simple: run split on a file or feed it data on its standard input, and it will create files of 1000 lines each by default. Use -b to give the size of the new files in kB or MB, or -l to change the default of 1000 lines. Split can also start a new file each time a line matches a regex given with -p.
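Here is a small sketch of the -l and -b flags and of reading from standard input. The file name notes.txt and the prefixes are invented for the example. I left -p out because, as far as I know, not every split implementation provides it (GNU coreutils does not), so check your manual page before relying on it.

```shell
# Work in a scratch directory so the parts do not mix with other files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: a 250-line text file.
awk 'BEGIN { for (i = 1; i <= 250; i++) print "line", i }' > notes.txt

# -l: 100 lines per part. The last argument replaces the default "x" prefix.
split -l 100 notes.txt lines.

# -b: parts of 1 kB (1024 bytes) each, whatever the line boundaries are.
split -b 1k notes.txt bytes.

# Reading from standard input: use "-" as the file name.
cat notes.txt | split -l 100 - stdin.

# Show what was created.
wc -l lines.*
ls -l bytes.*
```

Note that -b cuts at exact byte offsets, so a line can end up spread over two parts. That is fine for binary files, but for text that must stay line-aligned -l is the safer choice.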
Here is a simple example that splits a file into 1300 kB parts and then reassembles it from the parts, using sha256 to compare the checksums of the original and reconstructed files.
```
solene@kongroo ~/V/pmenu> split -b 1300k pmenu.mp4
solene@kongroo ~/V/pmenu> ls
pmenu.mp4  xab  xad  xaf  xah  xaj  xal  xan
xaa        xac  xae  xag  xai  xak  xam
solene@kongroo ~/V/pmenu> cat x* > concat.mp4
solene@kongroo ~/V/pmenu> sha256 pmenu.mp4 concat.mp4
SHA256 (pmenu.mp4) = e284da1bf8e98226dc78836dd71e7dfe4c3eb9c4172861bafcb1e2afb8281637
SHA256 (concat.mp4) = e284da1bf8e98226dc78836dd71e7dfe4c3eb9c4172861bafcb1e2afb8281637
solene@kongroo ~/V/pmenu> ls -l x*
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xaa
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xab
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xac
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xad
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xae
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xaf
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xag
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xah
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xai
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xaj
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xak
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xal
-rw-r--r--  1 solene  wheel  1331200 Mar 21 16:50 xam
-rw-r--r--  1 solene  wheel   810887 Mar 21 16:50 xan
```
If you ever need to split files into small parts, think about the command split.
For more advanced splitting requirements, the program csplit can be used. I won't cover it here, but I recommend reading its manual page.
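Just to give a taste of it, here is a minimal csplit sketch that cuts a file at lines matching a pattern. The file doc.txt, its content and the "section." prefix are invented for the example, and this is only one simple way to call it; the manual page describes many more options.

```shell
# Work in a scratch directory so the parts do not mix with other files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical document with three sections.
printf '%s\n' '# Intro' 'text a' '# Usage' 'text b' '# Notes' 'text c' > doc.txt

# Cut before the line matching each pattern.
# -s hides the byte counts, -f sets the prefix of the output files.
csplit -s -f section. doc.txt '/^# Usage/' '/^# Notes/'

# Parts are numbered: section.00, section.01, section.02
ls section.*
```

As with split, running cat on the parts in order gives back the original file.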