💾 Archived View for d.moonfire.us › blog › 2012 › 04 › 29 › an-obsession-with-data-normalization captured on 2024-12-17 at 10:01:00. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
while (<STDIN>) {
    # Clean up the line.
    chomp;

    # Pull out the first column.
    my @p = split(/\t/);
    my $f = $p[0];

    # Test to see if the file exists.
    if (! -f $f) {
        # The file doesn't exist, so complain.
        print STDERR "Can't find $f\n";
    }
}
%seen = ();
while (<STDIN>) {
    # Clean up the input line.
    chomp;

    # Split out the columns.
    my @p = split(/\t/);

    # If we already saw it, tell the user we skipped it. If not, then
    # print it out and add it to the hash so we don't print it again.
    if (exists $seen{$p[0]}) {
        print STDERR "Skipping $p[0]\n";
    } else {
        print "$_\n";
        $seen{$p[0]} = 1;
    }
}
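The de-duplication loop above can also be written as a small subroutine, which makes it easier to try out without piping a file through it. This is only a sketch: the name `unique_by_first_column` and the sample rows are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the first line seen for each value in
# the first tab-separated column, preserving the input order.
sub unique_by_first_column {
    my @lines = @_;
    my %seen  = ();
    my @kept  = ();

    foreach my $line (@lines) {
        # Split out the columns; the first one is the key.
        my @p   = split(/\t/, $line);
        my $key = defined $p[0] ? $p[0] : "";

        # Skip anything we have already kept.
        next if exists $seen{$key};

        $seen{$key} = 1;
        push @kept, $line;
    }

    return @kept;
}

# Made-up sample rows: a path, a tab, then a date.
my @rows = ("a.txt\t2012-01-01", "b.txt\t2012-01-02", "a.txt\t2012-03-04");
print "$_\n" foreach unique_by_first_column(@rows);
```

With the sample rows, the second `a.txt` line is dropped and the other two are printed in their original order.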
In Miw?fu, they call those who cannot use magic `miw: bachir?ma`. Translated into Lorban, it means “cursed to be forever mundane.” – Awakened Magic, Dastor Malink $
use strict; use warnings;
my $dir = $ARGV[0];
die "USAGE: dir input" unless -d $dir;
my $input = $ARGV[1];
die "USAGE: dir input" unless -f $input;
my %files = ();
open FILES, "<$input" or die "Cannot open input $input ($!)";
while (<FILES>) {
    chomp;
    my @p = split(/\t/);
    $files{$p[0]} = $p[1];
}
close FILES;
open PIPE, "find '$dir' -type f |" or die "Cannot open pipe ($!)";
while (<PIPE>) {
    # Clean up the line.
    chomp;

    # Ignore . files.
    next if m@/\.@;
    next unless m@\.txt$@;

    # Trim off the leading characters.
    s@^\./@@;

    # See if the file exists.
    my $file = $_;

    if (exists $files{$file}) {
        # We found the file.
        my $date = $files{$file};
        print "HIT $date $file\n";

        # Pull out the entries so we can report what was missing after
        # we're done processing.
        delete $files{$file};

        # Open up the file and read in the metadata section, looking
        # for an already existing date.
        my $found_date = 0;

        open FILE, "<$file" or die "Cannot open $file ($!)";

        while (<FILE>) {
            # Clean up the line.
            chomp;

            if (m@^\* Date:\s*(.*?)$@) {
                my $old_date = $1;
                $found_date = 1;

                if ($date eq $old_date) {
                    # Nothing to do, we're good.
                    last;
                }

                print "  $1\n";
            }
        }

        close FILE;

        # If we didn't find the date, we need to add it into the file.
        unless ($found_date) {
            my $need_date = 1;
            print "  Date: $date\n";

            # Copy the file into tmp, adding the date after the heading.
            open IN, "<$file" or die "Cannot open $file ($!)";
            open OUTPUT, ">tmp" or die "Cannot open tmp ($!)";

            while (<IN>) {
                print OUTPUT $_;

                if ($need_date && $_ =~ /^= /) {
                    print OUTPUT "* Date: $date\n";
                    $need_date = 0;
                }
            }

            close IN;
            close OUTPUT;

            # Keep the original as a backup and move the new file in.
            rename($file, "$file.bak");
            rename("tmp", $file);
        }
    } else {
        # Print the file.
        #print "SKIP $_\n";
    }
}
close PIPE;
foreach my $file (sort(keys(%files))) {
    print "MISS $file\n";
}
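The step that writes the missing date into a file can be looked at on its own, away from the file handling. The sketch below assumes the lines are already in memory without trailing newlines; the name `add_date_after_heading` and the sample document are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines with "* Date: ..." added right
# after the first heading line (one that starts with "= "). If there is
# no heading, the lines come back unchanged.
sub add_date_after_heading {
    my ($date, @lines) = @_;
    my $need_date = 1;
    my @out       = ();

    foreach my $line (@lines) {
        push @out, $line;

        if ($need_date && $line =~ /^= /) {
            push @out, "* Date: $date";
            $need_date = 0;
        }
    }

    return @out;
}

# Made-up sample document: a heading, one metadata line, then text.
my @doc = ("= Chapter One", "* Author: Someone", "", "Text.");
print "$_\n" foreach add_date_after_heading("2012-04-29", @doc);
```

Only the first heading gets a date, which matches the `$need_date` flag in the script above.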
https://d.moonfire.us/blog/2012/04/29/an-obsession-with-data-normalization/