Create DOC about safe code

WP documentation is scarce, and sometimes even plain wrong about how to handle safety.

We need a full blown doc about code safety, specially related to localised strings, escaping, sanitising and such things.

Please if anyone has good examples about classic safety issues in PHP/WP share here so we can then add to doc.

One classic that clearly is going to make it is the one that not even WP Doc gets right:

If you follow the suggestions there you will ruin your own HTML, and while “it will be safe” it will also be a broken link.

I think to add these things to the DOC:

Escaping and Sanitizing
Using YODA conditions
Sanitizing $_GET, $_POST, and other user input vars
Form submission nonce checks
The things you did not think about when using unserialize

I have drafts and examples for all above sections.

Does anyone know about something else, recurring all the time, and basically not properly documented anywhere so far?
Please share so I can include.

1 Like

I am tagging @invisnet about this.

1 Like

Hardening WordPress – WordPress.org Documentation
points to
Security – WordPress.org

Brute Force Attacks – WordPress.org Documentation
Changing File Permissions – WordPress.org Documentation

Plugins: Plugin Security | Plugin Developer Handbook | WordPress Developer Resources
covers how to check user capabilities, validate and sanitize input, sanitize output and create and validate nonces.

I think the Yoda conditions make the code harder to read. That is part of the WPCS, though.
Core handles serialization for options, but it’s probably good to mention the risks of doing it at all, if plugins write their own db tables.

We should re-use the content and create individual pages similar to WP resources. These are good and valuable resources we should make available to developers. They can be linked to from one main “security” page.

About YODA: errors that are catched by using this technique are easily catched with a simple phpcs rule.
Better using strict comparisons. If you forget to type a char you are comparing anyway :laughing:

2 Likes

Strict type comparisons are also part of WPCS, though. :wink:

Strongly agree with @Simone. The better approach to Yoda conditions is… to stop recommending Yoda conditions. Strict comparisons (=== and !== instead of == and !=) should be the recommendation as it’s the proper way to do comparisons, it’s more readable, and it completely negates the “need” for Yoda conditions.

5 Likes

With how ubiquitous JS has become, a document about safer coding should definitely include something on preventing XSS attacks when using vanilla JavaScript. It so happens Chris Ferdinandi just put out a series on this a few days ago.

These articles are from Chris’ daily newsletter. It’s one of the best I’ve ever subscribed to. Super informative, easy to understand, no tracking pixels or UTM-decorated links – just good, sound information. Highly recommended for anyone wanting to learn more about vanilla JS.

2 Likes

We haven’t documented a coding standard yet, but as far as I’m concerned this is the recommendation of the ClassicPress project for all new code.

2 Likes

I was writing up drafts of this on the doc page when I reached the topic “escape” and got in a self-contradicting limbo of “escape, but not”.

As those who follow #programming channel in slack probably know, I already determined that “escape everything” is not what happens in Core Code, nor what should happen in all cases.

WordPress itself does not, in no way escape the_content or the_title for example, and it does not sanitize the content either, on save, at least not for WP Admins/unfiltered_html users.
Googling this topic you will find all sort of garbage explanations as of why this is the case and what to do about it.
Literally you will find people saying “you don’t need to worry because WordPress escapes it for you” up to “It is safe because it applies the_content Filter”. Google it, it is worth the time to see how badly informed the world is about this problematic.

Not one post I found does in fact explain why it is not escaped, and how WordPress ensures that this is safe anyway.

What I have found by investigating this myself is the following:

  1. WordPress does not escape in any way said functions, neither in the very function nor in upstream functions, because if it would, we could barely blog post with simple HTML and never for example include a Form on a page. We would have to pass an immense array to wp_kses, which would never be complete, as we could never guess for example custom HTML attributes passed to a simple DIV by some user. Effectively that would make it very hard to let someone create more than “just a simple blog”.
  2. WordPress ensures that the_content echoed is safe, by making sure only trusted users can save certain content type
    Nota bene: not by sanitising, but by checking caps.
    Mainly, that is called the “unfiltered_html” capability. With this cap, you can save anything (including malicious JS or else stuff) to your post content, and it will display/render as such in the front end. As soon an editor without that cap would re-save your post or attempt to save such content it would get stripped or escaped.

You can also read more about this on the_title WP Doc, or here Reporting Security Vulnerabilities – Make WordPress Core

Thus, that much is clear: escaping always no matter what is not the correct advice.

The thing is - instances like WP VIP for example are hard on with the statement “You must escape everything”.
Of course, they do not even mention that the_content is unescaped (see Validating, sanitizing, and escaping · WordPress VIP Documentation).

The problem I have now is, if I say in the DOC “you must escape always” then we contradict the very core code.
If I say “You must escape only when needed”, then I need to specify when you have to escape.

Technically, it is not safe to assume that even the_content is “safe to not escape”.
This because it applies filters, and thus, a program can alter its contents even after you save it, not just an editor with unfiltered_html caps.
And that would in turn mean you must escape even the the_content.
Which then effectively would disallow you from putting any form on your website, just as an example.

So I think the approach WP takes here is “If you have 5 holes in a wall and a flood comes, and you can patch 3 of them, you patch 3 of them, so you only have to worry about 2”.
However I would like to create a more useful doc than using phrases like the above.

I thought of this:

You should escape always as late as possible, but if you for some reason you cannot (like when outputting the_content or when creating special outputs requiring special HTML), then you have the choice to not escape. In those cases, you must ensure that the input of said content is as safe as possible.
For example, if you where to accept user input thru a POSTed value by any user role (inclusive but not limited to guests), of course (apart of sanitisation) escaping is a must.
However if you where to work with a database stored value that only admins, or users with unfiltered_html can store, then you may not escape.
The rule of thumb is do escape always and as late as possible. If you cannot, make sure the input it echoes is produced by trusted users. Craft your code in a way that it is easy for any reviewer to immediately and without lot of effort reading long code to determine that indeed the input you echo is safe.
Do NOT exclude said lines with WPCS “ignore” statements when submitting a plugin for review, instead, add a clear comment to your code that helps the reviewer to follow the path to your input, to determine if indeed the content is safe.

What is everyone’s take on this?
Suggestions as of how to explain this without creating a whole lot of confusion but also without hiding the fact that core functions like the_content do not escape and that you do indeed have the possibility to use unescaped content?

I would very much like to hear @pluginvulns opinion on this as well, I assume you guys are aware of the unescaped the_content and doing plugin scans you will come across such situations then and when. Any suggestion in regard?

Not an expert, but I did try to read up on this a few years ago. The two resources I found most helpful are these two, because they had examples and explanations of the core functions and what/when should be escaped (especially the second one):

https://developerka.org/2018/05/30/escape-core-functions-wporg-themes/#get_bloginfo

yea, the first post you linked is in my list above - it states that you should escape the_content, which is simply not what happens (not even Theme Twenty Twenty-one does that)
The post is amongst the better sources thou, because at least it explains what will happen if you do escape it, and that WP does not escape it, as opposed to the other sources I linked above (there are 4 links in the phrase sort of garbage explanations as of :rofl:)

The second link you share I like, it explains how to use it (although it also misses the part as of how to make it safe)

Thanks Viktor, I think I will use that second source as a reference in the doc, it is useful indeed as a handy list of what to escape, at least for most functions.

1 Like

Are you going to mention escaping translations? That isn’t done in core either. Everything that has a filter could introduce a markup glitch at the least or a vulnerability at the worst. But these are low risk because of having to be able to get the code to run. If you can compromise the code, you can do anything you want to the site…

1 Like

Yes, I have that part sorted already (actually thru a WP Ticket I opened in their doc, as it was wrong). I know core doesn’t escape most translated things, have seen that (and also seen it in plugins).

There I think the rule can apply that translations are possibly from untrusted users - thus if not reviewed, pose a risk. And that justifies the “escape always”, imo, as well the non-risk form side of core since those translations are reviewed.

About “If you can compromise the code, you can do anything you want to the site…”, yes, that is exactly the source of this issue somehow.
Because technically most things saved are saved by users who can do this (forgetting about $_POST from front end at the moment).
Yet the requirements is generally (for example by plugin reviews) that you even escape options that only admins can save.
So this leads to the rather confusing state of things where it is somehow not clear wether yes or no to escape.

Do you think the proposed draft I have here Create DOC about safe code - #9 by anon66243189 makes sense?
(Referring to the part after I thought of this:)

It’s kind of the only way I found to justify both “Do escape always” with the exceptions and how to handle those.

Yes and no.
It is still a problem to use “where” instead of “were” in your writing.

Where is this document going and who is the target audience?
I think the fact that isn’t changing is that HTML is stored in certain fields of the database and those fields cannot be treated the same as raw data fields.
For the core contributor audience, the message would be to sanitize all inputs and escape what can be escaped, and check nonce and capabilities for all inputs.
For plugin and theme author audience, the message would be to make any new inputs raw data instead of HTML. Sanitize all inputs and escape when output. Let core handle core data. Check nonce and capabilities for all inputs (well, login is an exception). If HTML is necessary for input (like in comments), limit it by using one of the kses functions.

I think you need an example to make it clear why you escape. It’s not just about an exploit. Yesterday, I was modifying a plugin that displays Amazon products using a shortcode. I had used a template with the product title placeholder in the image alt tag, so it would be accessible. Unfortunately, the plugin wasn’t escaping the data from the API and the product title containing a " messed up the alt attribute. The browser displays it fine but it is invalid HTML and wouldn’t work with a screenreader. However, the escaping that I added to the plugin also limits the chance that some guy selling something on Amazon puts a <script> tag into his product title and messes with sites all over the internet. That is unlikely to be allowed over at Amazon, but this is the concept.

1 Like

Ok that’s helpful

I’ll rewrite that draft part and ask again once done.

Thx!

Same here, no todo tag.
If someone can add it, that would be great.