💾 Archived View for g.mikf.pl › gemlog › 2022-12-04-powershell.gmi captured on 2024-08-18 at 17:32:53. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Previously:
Imports System.IO
Imports System.Management.Automation
Imports System.Formats.Tar
Imports System.IO.Compression

<Cmdlet("New", "TarGz")>
Public Class NewTarGzCommand
    Inherits Cmdlet

    <Parameter(ValueFromPipeline:=True, Mandatory:=True)>
    Public FileNames As String()

    <Parameter(Mandatory:=True, Position:=0)>
    Public RootPath As String

    Protected Overrides Async Sub ProcessRecord()
        Dim filePaths = From fileName In FileNames
                        Select New With {
                            .PathFileName = Path.Join(RootPath, fileName),
                            .EntryName = fileName}
        Dim totalDirectorySize = (From filePath In filePaths
                                  Select FileLen(filePath.PathFileName)
                                  ).Sum()
        Dim tarStream = New MemoryStream(totalDirectorySize * 2)
        Dim tarWriter = New TarWriter(tarStream)
        Dim gzOut = New MemoryStream(totalDirectorySize)
        Dim gZipper = New GZipStream(gzOut, CompressionLevel.SmallestSize)
        Dim tarToGzPromise = tarStream.CopyToAsync(gZipper)
        For Each f In filePaths
            tarWriter.WriteEntry(fileName:=f.PathFileName, entryName:=f.EntryName)
        Next
        tarWriter.Dispose()
        Await tarToGzPromise
        WriteObject(gzOut.ToArray(), True)
    End Sub
End Class
And the Powershell script to use it:
$GemCap = "~/gemcap"
Get-ChildItem $GemCap -File -Recurse -Name |
    Where-Object { $_ -NotMatch "^(.*[\\\/])?\..*" } |
    New-TarGz (Resolve-Path $GemCap)
The script returns no output; redirecting to a file results in an empty file.
I shouldn't really make a cmdlet that does both tar and gz, because it won't be modular enough. I dislike how Powershell piping here apparently can only send whole collections once they are finished, but our archives aren't going to be too large (we're storing them in byte arrays anyway, after all), so maybe it would be good to write two separate cmdlets.
Imports System.IO
Imports System.Management.Automation
Imports System.Formats.Tar

<Cmdlet("New", "Tar")>
Public Class NewTarCommand
    Inherits Cmdlet

    <Parameter(ValueFromPipeline:=True, Mandatory:=True)>
    Public FileNames As String()

    <Parameter(Mandatory:=True, Position:=0)>
    Public RootPath As String

    Protected Overrides Sub ProcessRecord()
        Dim filePaths = From fileName In FileNames
                        Select New With {
                            .PathFileName = Path.Join(RootPath, fileName),
                            .EntryName = fileName}
        Dim totalDirectorySize = (From filePath In filePaths
                                  Select FileLen(filePath.PathFileName)
                                  ).Sum()
        Dim tarStream = New MemoryStream(totalDirectorySize * 2)
        Dim tarWriter = New TarWriter(tarStream)
        For Each f In filePaths
            tarWriter.WriteEntry(fileName:=f.PathFileName, entryName:=f.EntryName)
        Next
        tarWriter.Dispose()
        WriteObject(tarStream.ToArray(), True)
    End Sub
End Class
And running it with:
Get-ChildItem $GemCap -File -Recurse -Name |
    Where-Object { $_ -NotMatch "^(.*[\\\/])?\..*" } |
    New-Tar (Resolve-Path $GemCap) > examplefilename.tar
Now, we do get a file. A text file. A list of decimal byte values.
Our return type was an array of bytes.
People say the ways to write the bytes out are:
[io.file]::WriteAllBytes(filename, $_)
Set-Content -Path path -AsByteStream
This is laughable. Modern Powershell really has no better ways?!
But ok, I used the latter one for now.
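To make the difference concrete, here is a minimal sketch, assuming PowerShell 6 or later (needed for `-AsByteStream`). The sample bytes and the temporary file names are made up for illustration; absolute paths are used because the .NET call resolves relative paths against the process working directory, which may differ from the PowerShell location.

```powershell
# Hypothetical sample data: four arbitrary bytes standing in for an archive.
[byte[]]$bytes = 0x1F, 0x8B, 0x08, 0x00

# Scratch directory with made-up file names.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
New-Item -ItemType Directory -Path $dir | Out-Null
$txtViaRedirect   = Join-Path $dir 'redirected.txt'
$binViaDotNet     = Join-Path $dir 'dotnet.bin'
$binViaSetContent = Join-Path $dir 'setcontent.bin'

# 1. Plain redirection formats every byte as a line of decimal text.
$bytes > $txtViaRedirect

# 2. The .NET call writes the raw bytes.
[System.IO.File]::WriteAllBytes($binViaDotNet, $bytes)

# 3. Set-Content with -AsByteStream also writes the raw bytes.
Set-Content -Path $binViaSetContent -Value $bytes -AsByteStream

Get-Content $txtViaRedirect                      # text lines, one number each
(Get-Item $binViaDotNet).Length                  # size in bytes
(Get-Item $binViaSetContent).Length              # size in bytes
```

The two binary files should come out four bytes long, while the redirected one holds the decimal text.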
The result turns out to be a tar file with hidden contents: bsdtar 3.5.2 as well as 7-Zip only see the first file added, but I can see all of the files' contents when inspecting it raw with notepad. It is a 660KB tar file in which only the first added 4KB file is found.
"PaxHeaders" can be found repeatedly in the files' headers, which indicates that TarWriter defaulted to the "Pax" POSIX tar format mentioned as one of the options in its documentation.
So I decided to initialize TarWriter to another format:
New TarWriter(tarStream, TarEntryFormat.Ustar)
Ustar (value 2): POSIX IEEE 1003.1-1988 Unix Standard tar entry format.
The result: notepad inspection shows that the format is much different, but the result in bsdtar -t and in 7-Zip is the same. I even checked bsdtar -xvp just to be sure.
And the filenames of all the "hidden" files are also still there in the file!
I would also like to mention, just in case, that I don't really have a way at hand to debug these DLLs as I Import-Module them in Powershell.
Should I do another attempt with `TarEntryFormat.Gnu`? Ughhh should I..
I lazily did. It seemed indistinguishable from the old file, which I never checksummed, so there is a chance I just looked at the old file or at a result of the old DLL.
No idea where to go from there. But ok, I got that bsdtar at C:\Windows\system32\tar.exe
I should probably go on to use it and make a gzipping cmdlet
Funnily enough, redirecting the String produced by running that bsdtar 3.5.2 tar.exe like
tar cvpf "-" 'C:\Users\Mika Feiler\gemcap' > qwerty.tar
makes the String redirection produce an archive that is broken even for that bsdtar itself. And actually it seemed to me earlier that there were some memory leaks with - as the `f` argument, because some random strings, like what looked like rubbish from the PATH variable but also other 'deeper' stuff, were occurring in the error messages.
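One plausible cause for the broken archive (an assumption, the post does not verify it) is that Powershell versions before 7.4 decode a native program's output into text before `>` writes it back out. A minimal sketch of why such a round trip is lossy, using made-up sample bytes and assuming UTF-8 in both directions (the real console encoding may differ):

```powershell
# Hypothetical sample: bytes that are not valid UTF-8 text as a whole.
[byte[]]$original = 0x1F, 0x8B, 0x00, 0xFF

# Decode to a string and encode again, as a text pipeline would.
$text = [System.Text.Encoding]::UTF8.GetString($original)
[byte[]]$roundTripped = [System.Text.Encoding]::UTF8.GetBytes($text)

# Invalid bytes were swapped for replacement characters, so the data changed.
"original:      $($original.Length) bytes"
"round-tripped: $($roundTripped.Length) bytes"
```

Writing the archive straight to a file name with `tar cvpf` avoids the text layer entirely.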
The Process statement list runs one time for each object in the pipeline. While the Process block is running, each pipeline object is assigned to the $_ automatic variable, one pipeline object at a time.
The above is an excerpt from about_Functions, as I wanted to start writing my stuff in Powershell again, so I read up both about creating Cmdlets with about_Functions_Advanced as well as about regular functions, and decided to use the regular functions this time. The whole reason to stray from pure Powershell into .NET was not being able to do some too dotNETy things revolving around streams, but now it appears I no longer need to safe-haven in it for things to feel sane enough.
I was actually very dumb, seeing that I'm overriding a `ProcessRecord` method and not thinking anything of it. I had the feeling that something may ultimately be off when I don't have debugging for the DLLs that I Import-Module, but decided it couldn't. Now I finally know roughly what those mysterious methods to override for Begin and End do.
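The once-per-object behavior from the excerpt can be seen in a few lines. This is only a sketch: the function name `Show-PipelineCalls` and the sample file names are hypothetical. The `begin`, `process` and `end` blocks of a function correspond to a compiled cmdlet's `BeginProcessing`, `ProcessRecord` and `EndProcessing`.

```powershell
function Show-PipelineCalls {   # hypothetical name, for illustration only
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true, Mandatory = $true)]
        [string]$FileName
    )
    # Runs once, before the first pipeline object arrives.
    begin   { $seen = [System.Collections.Generic.List[string]]::new() }
    # Runs once per pipeline object; $FileName holds the current one.
    process { $seen.Add($FileName) }
    # Runs once, after the last pipeline object.
    end     { "process ran $($seen.Count) times: $($seen -join ', ')" }
}

$result = 'a.gmi', 'b.gmi', 'c.gmi' | Show-PipelineCalls
$result   # process ran 3 times: a.gmi, b.gmi, c.gmi
```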
How the heck did all the filenames go into the tar, though? My understandings of things clash so badly now. I gotta make another take at the Tar cmdlet.
Ooh, maybe because it was actually many tars appended to each other, one per `ProcessRecord` call. I guess it's not too easy to distinguish multiple tars appended to each other from a single tar.
Imports System.IO
Imports System.Management.Automation
Imports System.Formats.Tar

<Cmdlet("New", "Tar")>
Public Class NewTarCommand
    Inherits Cmdlet

    ' A single file name per pipeline object now, not an array.
    <Parameter(ValueFromPipeline:=True, Mandatory:=True)>
    Public FileName As String

    <Parameter(Mandatory:=True, Position:=0)>
    Public RootPath As String

    Public Shared MEMORYSTREAM_KB = 1000

    Private Shared ReadOnly Property MemoryStreamCapacity As Int32
        Get
            Return MEMORYSTREAM_KB * 1024
        End Get
    End Property

    ' One stream and one writer shared by all ProcessRecord calls.
    Private tarStream = New MemoryStream(MemoryStreamCapacity)
    Private tarWriter = New TarWriter(tarStream, TarEntryFormat.Ustar)

    ' Runs once per pipeline object: append one entry to the same archive.
    Protected Overrides Sub ProcessRecord()
        tarWriter.WriteEntry(
            fileName:=Path.Join(RootPath, FileName),
            entryName:=FileName)
    End Sub

    ' Runs once after the last object: finish the archive and emit it.
    Protected Overrides Sub EndProcessing()
        tarWriter.Dispose
        Dim tarArrayResult As Byte() = tarStream.ToArray
        ' True enumerates the array, so the bytes go down the pipeline one by one.
        WriteObject(tarArrayResult, True)
    End Sub
End Class
It now produces valid tars!
With just one issue: the backslashes in paths became middle-square-dot characters.
Managed to get that fixed by adding a dumb character replace to the pipe:
| foreach-Object { $_ -replace '\\', '/' } |
These tars seem to be fine.
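The doubled backslash in that replace is there because `-replace` takes a regular expression, where a single backslash is an escape character. A small sketch with a made-up sample path; the `.Replace()` string method is shown as a literal, non-regex alternative:

```powershell
# Hypothetical relative path, as Get-ChildItem -Name would return it on Windows.
$windowsStyle = 'gemlog\2022-12-04-powershell.gmi'

# Regex form: '\\' matches one literal backslash.
$tarStyle = $windowsStyle -replace '\\', '/'

# Literal form: no regex involved, so a single backslash is enough.
$tarStyleLiteral = $windowsStyle.Replace('\', '/')

$tarStyle          # gemlog/2022-12-04-powershell.gmi
$tarStyleLiteral   # gemlog/2022-12-04-powershell.gmi
```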
I discovered that 7z's gzip format does plain gzip without tar, so I could use it for compression. But now that I got the thing to work, maybe I should do gzip in .NET while I'm at it.
Imports System.IO
Imports System.IO.Compression
Imports System.Management.Automation

<Cmdlet("Compress", "Gzip")>
Public Class CompressGzipCommand
    Inherits Cmdlet

    ' Each pipeline object is a single byte.
    <Parameter(ValueFromPipeline:=True, Mandatory:=True)>
    Public Input As Byte

    Public Shared MEMORYSTREAM_KB = 1000

    Private Shared ReadOnly Property MemoryStreamCapacity As Int32
        Get
            Return MEMORYSTREAM_KB * 1024
        End Get
    End Property

    Private ReadOnly result = New MemoryStream(MemoryStreamCapacity)
    Private ReadOnly zipper = New GZipStream(result, CompressionLevel.SmallestSize)

    ' Runs once per incoming byte.
    Protected Overrides Sub ProcessRecord()
        zipper.WriteByte(Input)
    End Sub

    ' Runs once after the last byte: finish the gzip stream and emit the result.
    Protected Overrides Sub EndProcessing()
        zipper.Dispose()
        Dim resultBytes As Byte() = result.ToArray
        WriteObject(resultBytes)
    End Sub
End Class
invoked
Get-Content -AsByteStream -Path .\file.tar | Compress-Gzip | Set-Content -AsByteStream -path .\file.tar.gz
does produce a valid tar.gz accepted by `tar xz`.
Yeah, it does process it byte by byte, because I wasn't sure what the behavior would be with multiple byte arrays being processed. But it doesn't seem too bad, and at least it is streaming.
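For comparison, a sketch of the non-streaming alternative: compressing a whole byte array in one go from plain Powershell. The helper name `ConvertTo-GzipBytes` and the sample text are hypothetical, and `CompressionLevel.SmallestSize` assumes .NET 6 or later, as in the cmdlet above.

```powershell
function ConvertTo-GzipBytes {   # hypothetical helper name
    param([Parameter(Mandatory = $true)][byte[]]$Data)
    $out = [System.IO.MemoryStream]::new()
    $gz  = [System.IO.Compression.GZipStream]::new(
        $out, [System.IO.Compression.CompressionLevel]::SmallestSize)
    try     { $gz.Write($Data, 0, $Data.Length) }
    finally { $gz.Dispose() }    # disposing finishes the gzip stream
    # The leading comma keeps the byte array from being unrolled on output.
    , $out.ToArray()
}

# Made-up sample input.
$text = 'hello gemini ' * 50
[byte[]]$compressed = ConvertTo-GzipBytes -Data ([System.Text.Encoding]::UTF8.GetBytes($text))

# Decompress again to check the round trip.
$in     = [System.IO.MemoryStream]::new([byte[]]$compressed)
$gunzip = [System.IO.Compression.GZipStream]::new(
    $in, [System.IO.Compression.CompressionMode]::Decompress)
$plain  = [System.IO.MemoryStream]::new()
$gunzip.CopyTo($plain)
$gunzip.Dispose()
$restored = [System.Text.Encoding]::UTF8.GetString($plain.ToArray())
```

This trades the streaming behavior for holding everything in memory at once, which the cmdlets here already do anyway.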
PS C:\Users\Mika Feiler> Get-ChildItem $GemCap -File -Recurse -Name |
>> Where-Object { $_ -NotMatch "^(.*[\\\/])?\..*" } |
>> foreach-Object { $_ -replace '\\', '/' } |
>> New-Tar (Resolve-Path $GemCap) | Compress-Gzip | set-content -path realshit.tar.gz -asbytestream
PS C:\Users\Mika Feiler> Invoke-RestMethod -Method Post -Uri "https://pages.sr.ht/publish/g.mikf.pl" -Authentication OAuth -Token (Read-Host -AsSecureString) -Form @{ protocol="GEMINI"; content = Get-ChildItem .\realshit.tar.gz }
Equivalently, without the prompts:
Get-ChildItem $GemCap -File -Recurse -Name |
    Where-Object { $_ -NotMatch "^(.*[\\\/])?\..*" } |
    foreach-Object { $_ -replace '\\', '/' } |
    New-Tar (Resolve-Path $GemCap) |
    Compress-Gzip |
    set-content -path realshit.tar.gz -asbytestream
Invoke-RestMethod -Method Post -Uri "https://pages.sr.ht/publish/g.mikf.pl" -Authentication OAuth -Token (Read-Host -AsSecureString) -Form @{ protocol="GEMINI"; content = Get-ChildItem .\realshit.tar.gz }
Ok, but now I just did that and forgot about my tinylog gemfeed generator.
Now that one is quirky because its very spec declares the dates suitable for `date -d`, tying it to an implementation.
PS C:\Users\Mika Feiler> get-date -date "Nov18 2022 9:30 CET"
Get-Date: Cannot bind parameter 'Date'. Cannot convert value "Nov18 2022 9:30 CET" to type "System.DateTime". Error: "The string 'Nov18 2022 9:30 CET' was not recognized as a valid DateTime. There is an unknown word starting at index '16'."
PS C:\Users\Mika Feiler> get-date -date "Nov 18 2022 9:30 CET"
Get-Date: Cannot bind parameter 'Date'. Cannot convert value "Nov 18 2022 9:30 CET" to type "System.DateTime". Error: "The string 'Nov 18 2022 9:30 CET' was not recognized as a valid DateTime. There is an unknown word starting at index '17'."
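The unknown word is the `CET` abbreviation, which the .NET parser does not understand. A possible workaround in pure Powershell, sketched under assumptions: the header always ends in a zone abbreviation after the last space, the rest matches the second form tried above, and the small lookup table (made up here, covering only two zones) maps abbreviations to UTC offsets.

```powershell
# Hypothetical lookup table; extend it for whatever zones the tinylog uses.
$zoneOffsets = @{ 'CET' = '+01:00'; 'CEST' = '+02:00' }

$header = 'Nov 18 2022 9:30 CET'   # sample value taken from the attempt above

# Split off the trailing zone abbreviation.
$lastSpace = $header.LastIndexOf(' ')
$stamp = $header.Substring(0, $lastSpace)
$zone  = $header.Substring($lastSpace + 1)

# Parse with an explicit format and a numeric offset instead of the abbreviation.
$parsed = [System.DateTimeOffset]::ParseExact(
    "$stamp $($zoneOffsets[$zone])",
    'MMM d yyyy H:mm zzz',
    [System.Globalization.CultureInfo]::InvariantCulture)

# Same shape as the %F that date would print.
$feedDate = $parsed.ToString('yyyy-MM-dd', [System.Globalization.CultureInfo]::InvariantCulture)
$feedDate   # 2022-11-18
```

This is narrower than what `date -d` accepts, so it only helps if the tinylog headers stick to one format.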
For now I had to resort to bash from an unlikely-not-unlikely place:
& 'C:\Program Files\Git\bin\bash.exe' -c 'cd g.mikf.pl; date -d "$(grep "^## " tinylog.gmi | cut -c 4- | head -1)" +"# Mika Feiler%n=> tinylog.gmi %F Tinylog" > gemfeed-tinylog.gmi'