Blank screen when using the migration plugin

invisnet · February 26, 2020, 1:35pm

It could very well be the opcache. It’s easy enough to find out:

var_dump(opcache_get_configuration());

If you can run that on the same server it’ll tell you how the host has configured the opcache. opcache.revalidate_freq is the most obvious thing to look for.
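The full dump is long, so here is a minimal sketch that pulls out just the directives discussed in this thread. The helper name `relevant_opcache_directives` is made up for illustration; the only real API used is `opcache_get_configuration()`, which is assumed to return an array with a `directives` key when the extension is loaded and the API is not restricted.

```php
<?php
// Hypothetical helper: keep only the directives most relevant to stale files.
function relevant_opcache_directives(array $config): array
{
    $wanted = [
        'opcache.enable',
        'opcache.validate_timestamps',
        'opcache.revalidate_freq',
    ];
    $directives = $config['directives'] ?? [];
    $result = [];
    foreach ($wanted as $name) {
        // null means the directive was not reported at all.
        $result[$name] = $directives[$name] ?? null;
    }
    return $result;
}

// Only query the opcache if the extension is actually loaded.
if (function_exists('opcache_get_configuration')) {
    $config = opcache_get_configuration();
    if (is_array($config)) {
        var_dump(relevant_opcache_directives($config));
    }
}
```

Run it through the web server rather than the command line if you can, since the CLI often uses a different configuration.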

trying · February 27, 2020, 3:31pm

Thanks @wadestriebel @invisnet @james

First I’ll say that configuring opcache is new to me, but I have invested some time reading and experimenting.

The value of opcache.revalidate_freq is in seconds.

Obviously in a custom development environment it’s probably at 0 and all is good.

Default vs. commonly set values appear to range:

2 (the default, according to the Runtime Configuration page of the PHP manual)
60 (popular on several tech sites I visited)
3600 in production environments, or timestamp checks are disabled entirely with opcache.validate_timestamps set to FALSE, but hopefully those users would be well aware of the issue.

Shared hosting environments will presumably vary.

I used a replica and tried a limited number of runs at 2, 5, 10, 15, 60. Based on limited number of runs:

60 was always problematic
It worked once at 15 but failed once at 10, so obviously timing dependent.
2 was OK (but again limited runs)

Unfortunately it appears this could cause server load issues and upset hosts:

After some experimenting I could reduce the problem of the loads to just this: opcache_reset()

Once I perform this call (independently of any prior or later deployment steps, this happens also in isolation when I just call this endpoint) there’s a chance the system load suddenly spikes.

If that happens and the load is “too high” (I would say from experience > 200 or so), the system becomes unresponsive for seconds or minutes, depending

Based upon my crash course, this may be a better solution (but it needs more coding):

https://www.php.net/manual/en/function.opcache-invalidate.php

and there could still be other issues.

How do other installers, etc. handle it? (noting that installing new files is less problematic than file changes/ overwrites)

IIRC, when I upgraded a plugin it updated instantly, but it was seconds later before it said the plugin was ‘reactivated’.

james · February 27, 2020, 8:12pm

Interesting, I definitely see how this could be a problem for high-traffic servers running large applications.

We could use opcache_invalidate instead, but this needs to be called for every single file that is changed, which in the case of a migration is really just about every file.
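A rough sketch of what per-file invalidation might look like, not the code from any actual patch. `invalidate_changed_files` and its optional `$invalidate` callback are hypothetical names (the callback exists only so the loop can be exercised without a live opcache), and the `.php` extension filter is an assumption about which files the application compiles.

```php
<?php
/**
 * Hypothetical helper: invalidate cached bytecode for each changed PHP file.
 * Returns the paths for which the invalidation call reported success.
 */
function invalidate_changed_files(array $paths, ?callable $invalidate = null): array
{
    if ($invalidate === null) {
        if (!function_exists('opcache_invalidate')) {
            return []; // opcache extension not loaded; nothing to do
        }
        $invalidate = static function (string $path): bool {
            // force=true: invalidate even if the timestamp looks unchanged
            return opcache_invalidate($path, true);
        };
    }

    $done = [];
    foreach ($paths as $path) {
        // Assumption: only files ending in .php are compiled by the application.
        if (strtolower(pathinfo($path, PATHINFO_EXTENSION)) !== 'php') {
            continue;
        }
        if ($invalidate($path)) {
            $done[] = $path;
        }
    }
    return $done;
}
```

In an upgrade you would call it with just the list of replaced files, so that it falls back to the real `opcache_invalidate()`.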

We would also probably want to make the ClassicPress upgrade process replace and invalidate only the files that have actually changed; right now it just replaces everything.
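One possible way to find the files that actually changed is to compare content hashes between the installed tree and the unpacked new release. This is only a sketch under that assumption: `find_changed_files` and both directory parameters are hypothetical, and it does not handle files that were deleted in the new release.

```php
<?php
/**
 * Hypothetical helper: list files under $newDir whose contents differ from
 * the copy in $oldDir, or that do not exist there yet.
 * Returned paths are relative to $newDir and sorted.
 */
function find_changed_files(string $oldDir, string $newDir): array
{
    $oldDir = rtrim($oldDir, '/\\');
    $newDir = rtrim($newDir, '/\\');

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($newDir, FilesystemIterator::SKIP_DOTS)
    );

    $changed = [];
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $newPath  = $file->getPathname();
        $relative = substr($newPath, strlen($newDir) + 1);
        $oldPath  = $oldDir . DIRECTORY_SEPARATOR . $relative;

        // New file, or same path with different contents.
        if (!is_file($oldPath)
            || hash_file('sha256', $oldPath) !== hash_file('sha256', $newPath)) {
            $changed[] = $relative;
        }
    }
    sort($changed);
    return $changed;
}
```

Hashing every file costs some I/O, but it avoids relying on modification times, which are not preserved reliably by every archive or upload method.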

Can you post all of the opcache configuration settings in your server config please? I’m particularly interested in opcache.validate_timestamps, since it looks like disabling it would cause this problem, according to the accepted answer at your link.

In any case this is pretty convincing evidence of the opcache being responsible for this issue. Thanks for investigating further!

joyously · February 28, 2020, 3:11pm

Wouldn’t the same issue show up for a WP update or a CP update?

trying · February 29, 2020, 7:18pm

From my limited understanding that sounds like it is correct and would be best practice. Certainly better than calling opcache_reset

That sounds optimal.

I do not have it here but I can confirm 100% that opcache.enable and opcache.validate_timestamps were both true.

I also tried opcache.revalidate_freq at 3600 and after running the tool, the site (public+admin) was blank for an hour before coming back to life, as expected.
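That matches how I understand the two settings to interact. As a small sketch of the reasoning (the function `max_stale_seconds` is made up for illustration, and it only models the timestamp check, nothing else the opcache does):

```php
<?php
/**
 * Hypothetical helper: worst-case number of seconds a replaced file can keep
 * being served from the cache. Returns null when there is no upper bound.
 */
function max_stale_seconds(bool $validateTimestamps, int $revalidateFreq): ?int
{
    if (!$validateTimestamps) {
        // Timestamps are never checked: stale until a reset, an explicit
        // invalidation, or a restart of PHP.
        return null;
    }
    // A value of 0 means the file is checked on every request.
    return max(0, $revalidateFreq);
}

var_dump(max_stale_seconds(true, 3600));
```

With validate_timestamps on and revalidate_freq at 3600 the bound is 3600 seconds, which lines up with the site staying blank for an hour.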

Therefore I would agree:

Glad to help (I hope!)

trying · February 29, 2020, 7:21pm

Good point!

When I used the tool to revert to WP v5.3.2 it did load the updated page but not 100% correctly - there were content / layout issues.

The mystery continues.

james · March 1, 2020, 3:23am

I think this is as expected. The issues would be random depending on which files loaded stale versions.

Still, a migration from WP->CP or CP->WP is the most likely to cause issues, just because more files are being replaced during this process.

I think this code is a reasonable approach towards fixing this issue: “Invalidate the PHP opcache when files are upgraded” (Pull Request #567 by nylen in ClassicPress/ClassicPress on GitHub).

There is still a lot left to do though - it needs thorough testing using a custom ClassicPress build on servers with different opcache configurations. Also, there is an open WP ticket about this, so it would be good to coordinate the same fix across both projects: https://core.trac.wordpress.org/ticket/36455

trying · March 2, 2020, 2:41am

Yes, I was tired! Same effect: different files and changes cause different visible outcomes, but the underlying issue is the same. Also, changes to core files are obviously more likely to impact something.

Documented code - a sight for sore eyes

They may have reached the same conclusion:

I agree with a few folks in here that ideally the flush would be targeted to the exact files that changed. This would help avoid causing unnecessary churn on servers …

Did not look for their PR though.