Arc Forumnew | comments | leaders | submitlogin
Inline JavaScript
3 points by hjek 2287 days ago | 14 comments
In the Anarki repo, I noticed this comment above the `js-ext` in `lib/html.arc`:

    ; for strict CSP rules, inline code is allowed if the SHA-256 hash is included
    ; in the CSP headers. So it's better to either not restrict inline code, or
    ; move all javascript to an external file, including event binding.
From this I understand that a strict content security policy would disallow inline JavaScript unless a hash is provided in the headers, presumably because otherwise it could potentially be modified by an attacker.

I don't know what the risk is for these kind of attacks, but if we assume that inline JavaScript is bad and getting rid of it is good:

- Should it be disallowed in CSP rules in Arc?

- Does this mean that `votelink` should be modified to not use the `onclick` attribute?



3 points by i4cu 2287 days ago | link

In order for the News app to have a CSP and be strict about it, you would need to:

1. Remove the inline js. This means the votelink code (votejs) needs to be moved from news.arc and put into an external file (news.js?) that is linked to as a file within the header.

2. The inline onclicks need to change. The onclick values have to be actual function pointers not strings and given Arc has no built in js functionality that likely means removing them completely. Instead you will need to have a js call in the new 'news.js' file that does document.addEventListener with the 'DOMContentLoaded' argument along with a function that finds all the relevant doms for a given page and adds listeners to each that will trigger the votejs code.

3. srv.arc needs the addition of a Content-Security-Policy header for server ops (with the appropriate settings).

4. All inline style attributes need to be removed and changes to news.css or news.js will need to be made in order to compensate. i.e. stuff like this:

   (div style "margin-top:1px;margin-bottom:0px")
edit #1. note that adding the hash code referred to (or even the 'nonce' option) is a hack intended to provide short term relief to production environments until proper changes can be implemented.

edit #2. regarding point 4 I believe (but not absolutely sure of) that all the inline font, color, font-size tags are a problem too. i.e. It's any tagged string value that will be interpreted by the browsers css engine. If anyone can confirm this, please do. Either way, none of that stuff is HTML 5 compliant and probably should be removed anyway.

-----

3 points by krapp 2287 days ago | link

>srv.arc needs the addition of a Content-Security-Policy header for server ops (with the appropriate settings).

What's needed is the ability to pass custom headers from the application to srv.arc (or maybe app.arc) since CSP headers would be application specific. Unfortunately, unless I'm wrong, it looks like header generation is baked into srv.arc.

> All inline style attributes need to be removed and changes to news.css or news.js will need to be made in order to compensate. i.e. stuff like this:

A lot of that can be removed altogether by removing the table layout and just using a basic grid. There's no reason the forum has to be pixel-perfect. This would have the added benefit of letting us get rid of a lot of hacky one-off table macros in html.arc.

-----

4 points by i4cu 2287 days ago | link

> What's needed is the ability to pass custom headers from the application to srv.arc

There is the possibility of just putting the CSP into a meta tag within the page header, but I didn't suggest that because not all CSP options are available when using the meta tag.

I think you're right in that being able to dynamically add headers is the right way to go. When I moved from arc to clojure I did this by implementing something like arc templates [1] and used them to pass attributes through to the server ops. I ended up with a 'defop' like call that took an options hash-map argument (i.e. a template instance) which then generated the headers dynamically (with built in sane defaults).

1. http://arclanguage.github.io/ref/template.html

> A lot of that can be removed altogether by removing the table layout and just using a basic grid...

Yeah the whole thing should get HTML5-alived. CSS, JS and web-standards have evolved significantly since the app was originally written.

-----

3 points by krapp 2246 days ago | link

>Yeah the whole thing should get HTML5-alived. CSS, JS and web-standards have evolved significantly since the app was originally written.

Sorry to necro this but I'm hoping to have an update to the HTML and CSS finished soon.

I've consolidated the page templates to a single macro for the entire document, removed the tables and inline css, and got the threads working with unordered lists. I think the result might be as close as we can get in Arc to what you're describing (which is the way most frameworks operate - a single base template) without a massive rewrite of src.arc and html.arc as well.

-----

3 points by rocketnia 2239 days ago | link

"Yeah the whole thing should get HTML5-alived. CSS, JS and web-standards have evolved significantly since the app was originally written."

I kept wanting to bring up some of pg's writing on this topic, but I couldn't find it until now:

http://paulgraham.com/arc0.html

pg: "Arc embodies a similarly unPC attitude to HTML. The predefined libraries just do everything with tables. Why? Because Arc is tuned for exploratory programming, and the W3C-approved way of doing things represents the opposite spirit.

[...]

Tables are the lists of html. The W3C doesn't like you to use tables to do more than display tabular data because then it's unclear what a table cell means. But this sort of ambiguity is not always an error. It might be an accurate reflection of the programmer's state of mind. [...]

Good cleanness is a response to constraints imposed by the problem. Bad cleanness is a response to constraints imposed from outside-- by regulations, or the expectations of powerful organizations."

Personally, I think using semantic HTML isn't that big a deal to implement, and it seems to have practical benefits in terms of accessibility. It isn't just something the W3C is trying to impose on people arbitrarily.

And yet, the built-in features of HTML and CSS have a tendency of being very arbitrary. If you want a text box, you can have it; if you want a set of radio buttons, you can have it; if you want a set of radio buttons where one of them says "Other (please specify)" and has an associated text box, you suddenly have a significant amount of code to write. If you want to style the first letter of a paragraph, you can use a ::first-letter CSS selector... but if you want to style letters other than the first, you need to wrap them in explicit HTML elements, which unfortunately has other effects you might not want (like causing screen readers to treat each letter like it's a separate word).

Sometimes, there isn't a workaround. For instance, pages have titles, which appear in the top of the browser window. Have you ever seen a page with a title and a subtitle displayed just under it? I haven't, and short of finding a security hole that makes the browser execute arbitrary code, it seems pretty clear there isn't a way to do this.

Sometimes, there is a conceivable workaround, but it requires something like building your own text layout engine from scratch and then wrangling with a lot of obscure Unicode scripts, screen-reader troubles, text selection support, etc. There are some in-browser text editors which have to make this kind of effort just to achieve syntax highlighting.

And sometimes, there's a workaround that's a lot like building your own substantial subsystem of the browser, but it's actually fairly reasonable to do in a pinch. Like, there are a bunch of front-end frameworks for writing reactive UIs. They take in something that's pretty similar to DOM nodes (sometimes even obtained by parsing DOM nodes that aren't meant to be displayed to the user), they generate actual DOM nodes that are similar to those, and they modify those generated DOM nodes on the fly as the application state changes. In certain respects, these frameworks can save a lot of work by taking advantage of the underlying features of HTML... and in certain respects, there's extra work involved in inverting HTML's abstractions to get them to support this new indirect interaction style they weren't designed for.

Which brings me to another pg quote...

http://www.paulgraham.com/ilc03.html

pg: "But the advantage of a rewritable language is more than that it lets programmers fix your mistakes. I think the best programmers tend to work by rewriting whatever language they're using. So even the perfect language, if there is such a thing, would be very rewritable. In fact, if I had to guess, I think the perfect language might be whichever one was most rewritable."

How much code does it take for someone to implement their own "rewritten" variation of some HTML or CSS feature? Well, if they can't achieve their goal without writing their own text layout engine or their own virtual DOM framework, quite a lot.

This is important to Arc because it makes the program longer.

Arc is a language designed incrementally by starting with something Lispy and then making whatever changes will shorten Arc programs. News is a program that was written to put Arc to the test; the shorter its code is, the better Arc is doing.

If News used HTML or CSS features in very picky ways, it wouldn't be a good test: As soon as News's needs strayed slightly from the HTML and CSS features that browsers had built-in, then a heap of code would need to be written to make up the difference. When a slight discrepancy in the way a measuring instrument is consulted results in a large discrepancy in the measurement, it's not a very reliable measuring instrument.

So it seems to me that of all possible programs, News was pursued because it could get by on using HTML features in non-picky ways, pretty much in the ways they were already designed to be used. The use of HTML tables and transparent gif spacers was a well-known and once-popular, if dated, technique for achieving layout that was consistent across browsers, so pg built abstractions on that technique for News.

All that being said, I personally think it's a great improvement for News to use semantic HTML tags instead of tables, and I don't think this change really does that much to the size of the codebase (does it?). I just figure these pg writings are interesting in this context.

---

This has me thinking about html.arc....

The way html.arc is designed involves a lot of special-casing of specific HTML tags and attributes. It's almost like a full go-between layer abstracting HTML from Arc, which suggests that with some ambitious modifications to html.arc, it could turn into a DSL that compiles to HTML in a more indirect way (perhaps performing nonlocal transformations to implement things like footnotes or column breaks). This would potentially be a good place to hone the design of the HTML built-ins so that they're more abstraction-friendly and more "rewritable" as far as Arc code is concerned.

This has a lot in common with those front-end frameworks I mentioned. They abstract over and extend the features of HTML, and in doing so, they tend to make HTML's built-ins "rewritable" by exposing a new way for programmers to define their own extensions of the same kind.

-----

3 points by krapp 2239 days ago | link

>Tables are the lists of html

    (ノಠ益ಠ)ノ彡┻━┻ NO PG LISTS ARE ALREADY THE LISTS OF HTML
... sorry. Don't know what came over me there.

>and I don't think this change really does that much to the size of the codebase (does it?).

Most of it is the result of moving existing code around, so I think it comes out about even. I don't know how much of a performance issue macro expansion is but there is less of it in the new code, and the HTML itself should be simpler without tables.

>This has me thinking about html.arc....

Racket has its own xml/html library and there is an sml.arc which I haven't played with yet that seems like it might be capable. html.arc seems to do both too much and too little... the attributes blacklist makes it difficult to have modern features like data attributes, and the more macros there are, the more polluted the global namespace becomes.

I've sometimes thought it would be nice if Arc supported css and xml grammar natively, but I have no idea what it would take to actually support that. And I'm probably the only person here who wants to just write html and css directly, rather than use s-expressions or concatenating strings.

-----

2 points by i4cu 2239 days ago | link

> Tables are the lists of html. The W3C doesn't like you to use tables to do more than display tabular data because then it's unclear what a table cell means. But this sort of ambiguity is not always an error. It might be an accurate reflection of the programmer's state of mind. [...]

Note that it's traditional html tables pg is referencing. I could be wrong, but my understanding is that the 'display' properties options: 'table', 'table-row', 'table-cell' etc, were not well supported (in IE particularly) or even existent at the time pg wrote the code. So he may not believe the same now.

> The predefined libraries just do everything with tables. Why? Because Arc is tuned for exploratory programming, and the W3C-approved way of doing things represents the opposite spirit.

I don't agree. My understanding is that traditional tables are ridged, which is why they suggest you only put data into it. They are highlighting that it's, generally, not suitable for other things. Divs allow for more flexible manipulation. For example, I can create a table and then decide to break out of the table somewhere in the middle of it's content to render some component. Or I can make a table and morph it into something different by only changing it's properties via css/js.

But pg was not writing web apps. He was writing web pages and then calling a new page for, pretty much, ANY change. So from that perspective (where you can macro away server side) you can see why pg would just put it all in a table and highlight how using arc macros will let you do more compose-able things.

-----

3 points by i4cu 2238 days ago | link

FYI, obviously HN is a Web app, but I'm referring to the modern view of Web Apps where the workload is being done client side via javascript.

PS looks like my IP was banned. I think I made too many edits to a comment or whatever. So I probably will not be here for a while...

-----

4 points by shawn 2246 days ago | link

No such thing as necroing here. Post more Arc updates!

I'm thinking of starting a Discord server for Arc to give us a place to hang out.

-----

3 points by hjek 2287 days ago | link

> 4. All inline style attributes need to be removed and changes to news.css or news.js will need to be made in order to compensate.

Wat. Wow, browsers today! Is CSS vuln by default? Is that really necessary?

-----

3 points by i4cu 2287 days ago | link

Strict CSP settings are a form of whitelisting what js, css etc, is valid thus protecting from injection. Inline code for both js and css can't be whitelisted like header items can be so they will fail (unless you use the hash code hack mentioned for js).

Css is vulnerable too (since at least 2009):

https://scarybeastsecurity.blogspot.com/2009/12/generic-cros...

"By controlling a little bit of text in the victim domain, the attacker can inject what appears to be a valid CSS string. It does not matter what proceeds this CSS string: HTML, binary data, JSON, XML. The CSS parser will ruthlessly hunt down any CSS constructs within whatever blob is pulled from the victim's domain...."

Furthermore:

https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP

"A policy needs to include a default-src or script-src directive to prevent inline scripts from running, as well as blocking the use of eval() . A policy needs to include a default-src or style-src directive to restrict inline styles from being applied from a <style> element or a style attribute."

So it's just the 'style' attribute people worry about and strict CSP manages.

-----

2 points by hjek 2284 days ago | link

Thanks for all those links. Not sure I "get" the browsers of today. Not sure I can be bothered manually adding hash-codes for inline JS (Maybe doable from Arc, but sounds hacky).

It will be difficult to deal with some styling functions from Arc, like `grayrange` that's greying out comments with negative score. Perhaps JS is more suitable?

I wonder in which file the CSP would need to be implemented in Arc, or whether it's easier to set them in an Nginx config.

-----

2 points by i4cu 2284 days ago | link

> ... but sounds hacky

That's because it is a hack (as mentioned in my original comment edit#1).

My comments are only intended provide whatever help I can towards the original posting context which suggested a strict CSP criteria.

None of these things have to be done. It's up to you to decide, so really the question becomes what are you doing it for? Are you building a news site for a community of a few thousand people in a niche group? or are you making a news app that others can buy into for their own product/uses? The latter would make me want to ensure it's CSP capable, while the former - not so much.

> It will be difficult to deal with some styling functions from Arc, like 'grayrange'...

I would just create 10 or 20 or whatever number of css entries that act as a segmented gradient (call them .color-reduct1 to .color-reduct10) then create a server side function that takes the output value of grayrange and picks one the css entries. Then add that class to the html element and you're good to go. It's not a perfect gradient but it would be enough that I doubt it would make any noticeable difference.

Js is also an option, but then you have to store and pass the score into the js calculation which requires much more work then the above solution. Plus it forces you to expose the score (which HN no longer does)

> I wonder in which file the CSP would need to be implemented in Arc, or whether it's easier to set them in an Nginx config.

If you want to make code that's generic and useable by others then it needs to be in arc (not everyone will use Nginx). I suggested using arc templates [1] already and I still think this is the right way go. Establish the base template definition in srv.arc and then each app can modify that base template from their app file. Additionally allowing defop to optionally pass in over-rides will make it dynamic if you need that variance.

I'm sure there are dozen ways to do it, but that's my suggestion anyway.

1. http://arclanguage.com/item?id=20730

-----

2 points by krapp 2284 days ago | link

> Not sure I can be bothered manually adding hash-codes for inline JS (Maybe doable from Arc, but sounds hacky).

No one bothers, everyone moves all of their JS to an external file (which is what i'm working on now) or they just don't bother with CSP headers at all.

>It will be difficult to deal with some styling functions from Arc, like `grayrange` that's greying out comments with negative score. Perhaps JS is more suitable?

The score for each comment could be added as a data attribute and JS could apply the style based on that. Offloading that to JS might make the forum more responsive.

...as well as, maybe, having markdown done entirely in JS, but that's for the future.

[edit] ... as well as maybe thread folding with JS and localstorage.

-----