2019-07-02 Mojolicious help needed

I have this app that makes a gazillion web requests: at least two of them for every account in a list.

  my @results = map { overview $_ } @accts;

This is what the code looks like:

sub overview {
  # HACK ALERT: plenty of shortcuts here which might only work for Mastodon...
  my $account = shift;
  my ($username, $domain) = split "@", $account;
  my $ua = Mojo::UserAgent->new();
  my $result;
  # We should get the first URL from here, looking at the "aliases" key:
  # curl "https://octodon.social/.well-known/webfinger?resource=acct%3Akensanata%40octodon.social"
  my $url = "https://$domain/users/$username";
  my %obj = (id => $account, url => $url, bio => '', published => '');
  eval {
    $result = $ua->max_redirects(2)->get($url => {Accept => "application/json"})->result;
  };
  if ($@) {
    $obj{bio} = "<p>Error: $@</p>";
    return \%obj;
  }
  if (not $result->is_success) {
    $obj{bio} = "<p>" . $result->code . ": " . $result->message . "</p>";
    return \%obj;
  }
  $obj{bio} = $result->json->{summary};
  my $outbox = $result->json->{outbox};
  # We should get this URL from the previous one:
  # curl -H 'Accept: application/json' https://octodon.social/users/kensanata
  # gives us the "outbox" key and the value is a URL which we can fetch again
  # curl https://octodon.social/users/kensanata/outbox
  # and that gives us a short description including the "first" key which gives us a bunch of statuses
  # and we just look at the first one
  $url = "$outbox?page=true";
  eval {
    $result = $ua->max_redirects(2)->get($url => {Accept => "application/json"})->result;
  };
  if ($@) {
    $obj{published} = "<p>Error: $@</p>";
    return \%obj;
  }
  if (not $result->is_success) {
    $obj{published} = "<p>" . $result->code . ": " . $result->message . "</p>";
    return \%obj;
  }
  $obj{published} = $result->json->{orderedItems}->[0]->{published};
  return \%obj;
}

When I tried to rewrite it yesterday using promises, I failed.

I called the code as follows:

my @results = overview($c, $name, @accts);

And here’s the code, using Mojo::UserAgent's `get_p`, which returns a promise, and Mojo::Promise's `all`, which waits for all of them.

sub overview {
  my $c = shift;
  my $name = shift;
  my @accounts = @_;
  # Wrap continuation-passing style APIs with promises
  my $ua = Mojo::UserAgent->new->max_redirects(2)->inactivity_timeout(20);
  my @promises;
  for my $account (@accounts) {
    my ($username, $domain) = split "@", $account;
    # "https://octodon.social/users/kensanata"
    my $url = "https://$domain/users/$username";
    push @promises, $ua->get_p($url => {Accept => "application/json"});
  }
  warn "@promises";
  Mojo::Promise->all(@promises)
      ->then(sub {
	warn "@_";
	my @results;
	for my $promise (@_) {
	  my $result = $promise->[0]->result;
	  if ($result->is_success) {
	    push @results, $result->json;
	  } else {
	    push @results, { id => "A", url => "B", name => "C",
			     summary => "<p>" . $result->code . ": " . $result->message . "</p>" };
	  }
	}
	$c->render(template => 'do_overview', name => $name, accounts => \@results)})
      ->catch(sub {
        my $err = shift;
        warn "Connection error: $err";
	      })
      ->wait;
}

When I run this code, with a list of two accounts whose hosts are up and running, I get a connection error. The line with the hashes is the `warn` line in my code which shows that I do in fact have two promises.

Mojo::Promise=HASH(0x559e19d48038) Mojo::Promise=HASH(0x559e1c099468)
Connection error: Premature connection close

I’m staring at the manual pages for Mojo::UserAgent and Mojo::Promise and just don’t understand what I need to change.

A note on unknown hosts: The two accounts I’m checking here are on reachable hosts, so I don’t understand what the premature closing is all about. But even if they were unreachable, I need the code to not abort. Sadly, many of the accounts I’m checking are on hosts that no longer exist, which is part of the reason I need to check them. That’s why the code above wraps the `get` call in an `eval` block. I need to do something like that, somewhere.
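One way to get the effect of the old `eval` block with promises is to attach a `catch` to every single promise before handing them to `all`, so that a dead host turns into an ordinary result instead of rejecting the whole batch. The following is only a sketch under assumptions: the two accounts and the `%pending` hash are made up, and already-settled promises stand in for the `$ua->get_p(...)` calls so that it runs without a network.

```perl
use strict;
use warnings;
use Mojo::Promise;

# Hypothetical stand-ins for $ua->get_p(...): one host answers, one is gone.
my %pending = (
  'alice@up.example' => Mojo::Promise->resolve('bio of alice'),
  'bob@gone.example' => Mojo::Promise->reject("Can't connect"),
);

my @promises;
for my $account (sort keys %pending) {
  push @promises, $pending{$account}
    ->then(sub {
      my $value = shift;
      my %obj = (id => $account, bio => $value);
      return \%obj;
    })
    ->catch(sub {
      # A failure becomes an ordinary result, so all() never sees a rejection
      my $err = shift;
      my %obj = (id => $account, bio => "<p>Error: $err</p>");
      return \%obj;
    });
}

my @results;
Mojo::Promise->all(@promises)
  ->then(sub {
    # all() hands over one array reference per promise, in the original order
    @results = map { $_->[0] } @_;
  })
  ->wait;

print "$_->{id}: $_->{bio}\n" for @results;
```

With real requests, the `then` callback would receive the transaction and could do the `is_success` check there, just like the blocking version does after its `eval`.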

Anyway, I wrote a little test script to try and get a minimal working example, but that works as intended:

use Modern::Perl;
use Mojo::UserAgent;
use Mojo::Promise;

my $ua = Mojo::UserAgent->new;
my @accounts = qw(kensanata@octodon.social kensanata@dice.camp);
my @promises;
for my $account (@accounts) {
  my ($username, $domain) = split "@", $account;
  my $url = "https://$domain/users/$username";
  warn $url;
  push @promises, $ua->get_p($url => {Accept => "application/json"});
}
warn "@promises";
Mojo::Promise->all(@promises)
    ->then(sub {
      warn "@_";
      my @results;
	for my $promise (@_) {
	  my $result = $promise->[0]->result;
	  if ($result->is_success) {
	    push @results, $result->json;
	  } else {
	    push @results, { id => "A", url => "B", name => "C",
			     summary => "<p>" . $result->code . ": " . $result->message . "</p>" };
	  }
	}
      say "@results" })
    ->catch(sub {
      my $err = shift;
      warn "Connection error: $err";
	    })
    ->wait;

The output:

https://octodon.social/users/kensanata at test.pl line 11.
https://dice.camp/users/kensanata at test.pl line 11.
Mojo::Promise=HASH(0x5595942869b8) Mojo::Promise=HASH(0x559595a8cc38) at test.pl line 14.
ARRAY(0x559595b20598) ARRAY(0x559595b39080) at test.pl line 17.
HASH(0x559595b20748) HASH(0x559595b396c8)

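So... what’s my problem? I asked on IRC. There’s a `#mojo` channel on Freenode. Users `mst` and `CandyAngel` helped me out. The problem was that my code used a lexical variable `$ua` for the user agent inside the `overview` sub. Inside the app, the event loop is already running, so `->wait` returns right away and the sub finishes while the promises are still pending. At that point `$ua` goes out of scope, the user agent is destroyed, and its connections are closed: that’s the “Premature connection close”. In the test script, `$ua` lives until the script ends and `->wait` really does block, which is why it works there. The solution is to make sure the user agent is kept alive.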
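The difference between the app and the test script comes down to `wait`: as far as I understand the Mojo::Promise documentation, it starts the event loop and stops it again once the promise is settled, but does nothing when the loop is already running. Here is a small sketch of that behaviour, with a timer standing in for a slow web request (the timer, the 0.1 second delay and the `@events` list are just for illustration):

```perl
use strict;
use warnings;
use Mojo::IOLoop;
use Mojo::Promise;

my @events;

# Everything inside next_tick runs while the loop is already running,
# which is the situation inside a Mojolicious action.
Mojo::IOLoop->next_tick(sub {
  my $promise = Mojo::Promise->new;
  Mojo::IOLoop->timer(0.1 => sub { $promise->resolve('some value') });
  $promise->then(sub {
    push @events, "resolved with $_[0]";
    Mojo::IOLoop->stop;
  });
  $promise->wait;    # does nothing here: the loop is already running
  push @events, 'wait returned';
});

Mojo::IOLoop->start;
print "$_\n" for @events;
```

The "wait returned" entry is recorded before the promise resolves, so any lexical that only lives until the end of the surrounding sub is gone by the time the callback runs.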

One way is to add `->finally(sub { undef $ua })` in there, or even better: just use the app’s own user agent!

my $ua = $c->app->ua;

That fixed it!
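Why does the `finally` trick work? A callback that mentions `$ua` is a closure, and a closure keeps the variables it uses alive for as long as the callback itself exists. This can be shown in plain Perl without any Mojolicious at all. In the sketch below, `Agent` is a made-up class that does nothing except record when it gets destroyed, and the two subs are hypothetical stand-ins for `overview` without and with the closure.

```perl
use strict;
use warnings;

my @events;

# Hypothetical stand-in for a user agent: it only records its own destruction.
package Agent {
  sub new     { my ($class, $name) = @_; return bless {name => $name}, $class }
  sub DESTROY { push @events, "$_[0]{name} destroyed" }
}

# The callback does not mention $ua, so the agent dies when the sub returns.
sub without_capture {
  my $ua = Agent->new('first');
  return sub { 'done' };
}

# The callback mentions $ua, so the closure keeps the agent alive
# until the callback runs and explicitly lets go of it.
sub with_capture {
  my $ua = Agent->new('second');
  return sub { undef $ua; 'done' };
}

my $cb1 = without_capture();
push @events, 'first sub returned';

my $cb2 = with_capture();
push @events, 'second sub returned';
$cb2->();
push @events, 'second callback ran';

print "$_\n" for @events;
```

The first agent is destroyed before its sub has even returned to the caller; the second one survives until its callback runs. Using the app’s own user agent avoids the question entirely, since that one lives as long as the app does.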

Well... except that stuff still didn’t run in parallel. As I discovered, that’s not what promises are for!

I had two options. One was `MCE::Loop`, which I had previously used to run jobs in parallel (in the code that’s currently disabled). The other was to do it with Mojolicious itself: I remember trying to figure that out and failing, so today I tried again. It turns out that you can do it, if you use `Minion`.

Sadly, I again ran into many issues. The documentation is never as straightforward as I expect it to be. The use cases aren’t clear to me. For example, I could not get it to work using the default worker, `app->minion->worker`. If I had my own worker and ran `perform_jobs`, it didn’t run in parallel. If I ran `run`, it didn’t return. On the `#mojo` channel they said that I should just start the workers once and then leave them running: as long as the queue is empty, they aren’t wasting resources. But I just couldn’t get it to work. So in the end I copied an example from the manual that was labelled “a custom worker performing multiple jobs at the same time.” That did the trick.

I still feel bad about using a database backend. I would not have minded an in-memory solution! Perhaps in the end I should have just used `MCE::Loop`.

This uses a temporary database:

plugin Minion => { SQLite => ':temp:' };

This adds a task at the top level:

app->minion->add_task(overview => sub {
  my ($job, $account) = @_;
  $job->finish(overview $account) });

And this is the code that uses it, with up to 40 requests in parallel and checking every second if they’re finished and if we should start more.

  my @ids = map { $c->app->minion->enqueue(overview => [$_]) } @accounts;

  my %jobs;
  my $worker = $c->app->minion->repair->worker->register;
  do {
    for my $id (keys %jobs) {
      delete $jobs{$id} if $jobs{$id}->is_finished;
    }
    if (keys %jobs >= 40) { sleep 1 }
    else {
      my $job = $worker->dequeue(1);
      $jobs{$job->id} = $job->start if $job;
    }
  } while keys %jobs;
  $worker->unregister;

  my @results = map { $c->app->minion->job($_)->info->{result} } @ids;
  $c->render(template => 'do_overview', name => $name, accounts => \@results);

I think now it works. 😓

​#Mojolicious ​#Perl