Talk:Cross-site leaks

From Wikipedia, the free encyclopedia
Former featured article candidate: Cross-site leaks is a former featured article candidate. Please view the links under Article milestones below to see why the nomination was archived. For older candidates, please check the archive.
Good article: Cross-site leaks has been listed as one of the Engineering and technology good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it.
Article milestones
Date                 Process                       Result
November 6, 2023     Good article nominee          Not listed
November 11, 2023    Good article nominee          Listed
February 7, 2024     Guild of Copy Editors         Copyedited
February 10, 2024    Peer review                   Reviewed
March 26, 2024       Featured article candidate    Not promoted
May 10, 2024         Peer review                   Reviewed
A fact from this article appeared on Wikipedia's Main Page in the "Did you know?" column on November 14, 2023.
The text of the entry was: Did you know ... that cross-site leaks can be used to gain information about your web browsing habits?
Current status: Former featured article candidate, current good article

Did you know nomination

The following is an archived discussion of the DYK nomination of the article below. Please do not modify this page. Subsequent comments should be made on the appropriate discussion page (such as this nomination's talk page, the article's talk page or Wikipedia talk:Did you know), unless there is consensus to re-open the discussion at this page. No further edits should be made to this page.

The result was: promoted by PrimalMustelid talk 01:51, 6 November 2023 (UTC)

5x expanded by Sohom Datta (talk). Self-nominated at 07:39, 2 October 2023 (UTC). Post-promotion hook changes for this nom will be logged at Template talk:Did you know nominations/Cross-site leaks; consider watching this nomination, if it is successful, until the hook appears on the Main Page.

  • Article was created on 31 August but DYK check tool shows expansion in October so no issues there. Copyvio detector checks out and hook is short enough and interesting enough. QPQ not needed for user's first nomination. I do have a few comments, however:
  • I can't help thinking given the hyphen usage in the sources that the article should really live at Cross-site leaks.
  • The article needs a good copy-edit before featuring on the main page - there are several spelling/grammatical errors (e.g. orgin/origin and users/user's) and inconsistencies (e.g. url and URL).
  • Hook fact does appear in the article and is cited, although not using the reference provided here (which as a Wiki page wouldn't count as a reliable source anyway), but the sources in the article are journals so that's fine
  • Sourcing meets the minimum one per paragraph, however I don't see what makes appsecmonkey.com a reliable source - can you provide a better reference?
BigDom I've fixed the issues that you mentioned:
  • I can't help thinking given the hyphen usage in the sources that the article should really live at Cross-site leaks.
  • Moved the page to the correct hyphenated usage.
  • The article needs a good copy-edit before featuring on the main page - there are several spelling/grammatical errors (e.g. orgin/origin and users/user's) and inconsistencies (e.g. url and URL).
  • Gave the article a copyedit via spell check software.
  • Hook fact does appear in the article and is cited, although not using the reference provided here (which as a Wiki page wouldn't count as a reliable source anyway), but the sources in the article are journals so that's fine
  • I have replaced the cite with a citation to a paper submitted for the SecWeb workshop, however, XSleak wiki tends to be fairly reliable in the field and is cited by multiple journal papers that go into depth about the topic.
  • Sourcing meets the minimum one per paragraph, however, I don't see what makes appsecmonkey.com a reliable source - can you provide a better reference?
  • Ditto as above.
-- Sohom (talk) 04:28, 11 October 2023 (UTC)
@Sohom Datta: Thanks for the edits, it's looking even better now. I made a couple of very small edits myself (adding a missing apostrophe and wikilinking web application; hope this isn't a problem). Anyway, happy to support this now. BigDom (talk) 04:49, 11 October 2023 (UTC)
Noting that the article has been rewritten and expanded by a lot in the last few days, @BigDom maybe you could take a look at it once more (just in case) -- Sohom (talk) 10:08, 31 October 2023 (UTC)

GA Review

This review is transcluded from Talk:Cross-site leaks/GA1. The edit link for this section can be used to add comments to the review.

Reviewer: Equalwidth (talk · contribs) 09:44, 6 November 2023 (UTC)


GA review
(see here for what the criteria are, and here for what they are not)
  1. It is reasonably well written.
    a (prose, spelling, and grammar):
    b (MoS for lead, layout, word choice, fiction, and lists):
  2. It is factually accurate and verifiable.
    a (references):
    b (citations to reliable sources):
    c (OR):
    d (copyvio and plagiarism):
  3. It is broad in its coverage.
    a (major aspects):
    b (focused):
  4. It follows the neutral point of view policy.
    Fair representation without bias:
  5. It is stable.
    No edit wars, etc.:
  6. It is illustrated by images, where possible and appropriate.
    a (images are tagged and non-free images have fair use rationales):
    b (appropriate use with suitable captions):

Overall:
Pass/Fail:

· · ·
I disagree with this assessment. I do not believe that there is any original research, and the images have been appropriately tagged with the correct licensing. Additionally, while not all aspects of the topic have been comprehensively covered, I believe the coverage is sufficient to give a layperson enough context on the topic. Similarly, while the prose might not be at GA standard, I don't think it is far enough off to merit a quick fail. Due to this, I have renominated the article. -- Sohom (talk) 12:17, 6 November 2023 (UTC)

GA Review

This review is transcluded from Talk:Cross-site leaks/GA2. The edit link for this section can be used to add comments to the review.

Reviewer: RoySmith (talk · contribs) 23:05, 7 November 2023 (UTC)

@Sohom Datta: Starting review now. Just for your information, I'm broadly familiar with web security, but not an expert in this particular topic. RoySmith (talk) 23:05, 7 November 2023 (UTC)

PS, I see you're already working on this, which is great. To reduce confusion, all my comments will be against Special:Permalink/1183951373. RoySmith (talk) 00:23, 8 November 2023 (UTC)
Sounds good, I'm going through the easier ones to start with, and will look into the more complicated ones (for example, defining an execution context). Sohom (talk) 01:01, 8 November 2023 (UTC)
@RoySmith I've gone ahead and addressed most of the issues. With regard to the code snippet issue, as mentioned below, my understanding is that the license under which the paper was published allows for the reuse of the code with attribution. Let me know if there are any other concerns with the current version :) Sohom (talk) 21:31, 8 November 2023 (UTC)
Sounds good. I'll try to check in on this tomorrow and hopefully wrap this up in the next couple of days. My one question at this point, now that I know the cited paper is CC-BY 4.0, is why you made the minor changes to the code snippets that you did? Why not just show the code exactly as it's shown in the paper? RoySmith (talk) 00:49, 9 November 2023 (UTC)
Most of the changes are to make the code closer to how an actual vulnerability exploit might look, and some are to make the code fit inside the 20em space. For example, for a cross-origin page the { mode: 'no-cors' } values must be included for the request to succeed. Similarly, the comments at the top of the JS code snippet demonstrate that the attacker needs to create an empty iframe before setting the load event handler. (A non-empty iframe might not fire a load event in certain browsers.) Some of the URLs have been shortened so that they fit inside the sidebar, and the performance.now() statements have been rearranged for the same reason. Sohom (talk) 09:21, 9 November 2023 (UTC)
I only have a few minutes before I need to run, but I see several problems here. As I noted above, I'm not a web security expert, but I do know enough to understand that subtle changes to code sometimes have profound effects, especially when you're looking at ephemeral things like execution timings and the observable effects of caching.
Basically you're saying, "Trust me, the changes I made won't make a difference", which is WP:OR. I feel strongly that if you're going to present these snippets, you should present them exactly as in the paper, down to the placement of line breaks and semicolons, since (mind-bogglingly) this is significant in JavaScript.
As for making the code fit into a 20em space, I wouldn't do that either. You don't know what display hardware the end-user will have. They could be viewing it on a mobile phone. They could be listening to it with a screen reader. The more you try to fine-tune the display with the fancy wikitable formatting, the more chance you're just making it worse for users not using the same kind of display you're using. I would just run it in-line with the main text instead of trying to force it to the margin with floatright. Using syntaxhighlight as you're doing is good, as is the nowrap directive, but mostly just let the browser handle the layout. RoySmith (talk) 14:40, 9 November 2023 (UTC)
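As a small illustration of the point about line breaks (this is not taken from the cited paper, and both function names are made up for the example), JavaScript's automatic semicolon insertion makes the following two functions behave differently even though they differ only by a newline after `return`:

```javascript
// Hypothetical example: a line break after `return` changes the result.
function withLineBreak() {
  // A semicolon is inserted right after `return`, so the function
  // returns undefined; the next line is parsed as an unreachable block.
  return
  { ok: true };
}

function withoutLineBreak() {
  // Here the braces are parsed as an object literal and returned.
  return { ok: true };
}

console.log(withLineBreak());    // → undefined
console.log(withoutLineBreak()); // → { ok: true }
```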
I disagree with it being called original research; most of the changes are within reason and could be cited to specific JS standards etc. (in fact, the code given in the paper is invalid if run without any changes and will throw a syntax error ☹). However, I do understand your concerns with changing the code and have changed it to be exactly the same as in the paper :) Sohom (talk) 15:05, 9 November 2023 (UTC)
@RoySmith I ended up creating a new section for the example and putting the example in there (borrowing from how Rust (programming language) does things). I have added some new text, but for the most part, the code remains the same except for 1 small change (the addition of async) that fixes a syntactical issue with the code in the paper. Let me know if there are any other issues. Sohom (talk) 11:02, 11 November 2023 (UTC)
My apologies for not getting back to this sooner. In any case, I like the current presentation. The fact that the published paper shows code which is invalid did throw me for a loop. I'm OK with the change you made, but please add a note somewhere explaining why the code you're presenting is different from what's in the source. Consider the case of a reader trying to follow along in the code; when they get to where your code differs from the cited paper's, they won't understand why it doesn't match up unless you explain what's going on. RoySmith (talk) 12:09, 11 November 2023 (UTC)
@RoySmith Added a note at the top of the snippet pointing out that a minor modification was made (and why and what it is) Sohom (talk) 12:24, 11 November 2023 (UTC)
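
For readers following the thread above, here is a minimal sketch of the kind of timing measurement being discussed. It is not the snippet from the cited paper or from the article. `timeRequest`, its injectable `fetchImpl` and `now` parameters, and the `victim.example` URL are hypothetical names chosen for illustration. It shows only two points raised in the discussion: passing `{ mode: 'no-cors' }` so that a cross-origin request can complete, and bracketing the request with `performance.now()` calls.

```javascript
// Hypothetical helper (an illustration, not the paper's code): measures how
// long one cross-origin request takes to settle.
// fetchImpl and now are injectable so the sketch can also run outside a
// browser, for example with a stubbed fetch and a fake clock.
async function timeRequest(url, fetchImpl = fetch, now = () => performance.now()) {
  const start = now();
  // With mode 'no-cors' the request resolves with an opaque response
  // instead of being rejected for lacking CORS headers.
  await fetchImpl(url, { mode: 'no-cors' });
  // The response body is unreadable; only the elapsed time is observed.
  return now() - start;
}

// Assumed usage on an attacker-controlled page (the URL is a placeholder):
// timeRequest('https://victim.example/resource').then((ms) => console.log(ms));
```

A cached resource would typically settle faster than an uncached one, which is the signal a cache-timing attack relies on. Real measurements are noisy, and this sketch makes no attempt to repeat or average them.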

Lead

  • The two hatnotes (see Cross-site scripting and see Cross site request forgery) seem contrary to WP:RELATED. I use hatnotes like this when it's likely somebody would have guessed at an article title and ended up in the wrong place. I don't think anybody looking for either of those would have typed "cross-site leaks" into a search box. This isn't a WP:GACR, but consider if you could handle this better in the body.
  • MOS:LEAD says "Apart from basic facts, significant information should not appear in the lead if it is not covered in the remainder of the article." Things I see in the lead that aren't mentioned anywhere in the article include "XS-Leaks", "browsing session", "side channel", "cache timing information".
  • "first discovered by researchers at Purdue University": it's hard to prove something was the first. The body says "as far back as 2000", which is a better way to phrase this, since it allows that there may have been earlier papers.

Background

  • "two primary components: a web browser and multiple web servers.": "one or more web servers"?
  • "via the HTTP protocol and socket connections": that's usually true, but doesn't have to be. I'd throw "typically" in there somewhere. You could say that the rest of this article assumes that's the case.
  • "render a web application": I'm not sure "render" is the right verb here. Deliver? Implement?
  • "executing HTML, CSS or Javascript": You certainly execute javascript. I'm not sure "execute" is the right verb for HTML or CSS, however.
  • "transitions in between": just "transitions between", I think?
  • "These states are often synced...": I'm not sure what this is trying to say, but "synced" doesn't seem like the right word.
  • Fixed: Removed that sentence. I don't see it in the source and I think I might have confused this with a source that was talking about only COSI attacks.
  • "To provide isolation and security of": maybe, "To securely isolate"?
  • You should describe what an "execution context" is.
  • Partly done: I'm not too happy with the way I (and the research papers) have done this, since it effectively offloads the definition to the concept of a web origin, which I wish was a blue link :( I'm gonna try and see if I can simplify this. Sohom (talk) 12:20, 8 November 2023 (UTC)
  • "A specific web application": drop "specific".
  • "cannot reach into a different web app's execution context": I think you mean "cannot reach into another execution context". If I've got two windows open on the same URL, it's the same web app, but different execution contexts.
  • "arbitrarily gain information": that's an odd phrase. Is "gain" the word that's used in the sources? If not, then maybe "learn" or "obtain", or even "infer" might be better?
  • The sources say "interact" which is what I am going to go with here. I also did a bit of cleanup in the citations since one sentence in the middle should have been cited to the XSinator paper. Sohom (talk) 18:47, 8 November 2023 (UTC)
  • Define what you mean by "attacker origin" and "victim origin".
  • "This can lead to the attacker accessing sensitive information about a user's previous browsing habits.": "activity" instead of "habits"? And any information you get that you're not supposed to have is inherently "sensitive".

Mechanism

(I'm going to be on the road for most of the next week. I'll drop in on this as I have time and connectivity, but it may stretch out for the better part of the week)

  • "relies on the attacker being able to ... under the adversary's control.": I'm pretty sure you're using "attacker" and "adversary" as synonyms here, referring to the same actor. Normally in creative writing you want to use varied vocabulary to keep the prose interesting. In technical writing like this, I think you'll do better to stick to a fixed vocabulary, i.e. pick one of "attacker" or "adversary" (or whatever) and use that term consistently, in the style of Alice and Bob. The writing will sound a bit more stilted, but it'll be easier for a reader to follow.
  • "by phishing the user to a web page": link "phishing".
  • "While every method of including a URL in a web page can be combined ...": I think you mean "can in theory be combined"?

History

  • "known for over 23 years": that will become stale next year. I'd just use the year, i.e. "have been known since 2000".
  • "papers ... that describe attacks that leverage the HTTP cache": were these attacks theoretical, or actual attacks observed in the wild?
  • I believe the attacks were theoretical at the time the paper was published (their research paper goes into how an attacker might measure the timing differences); however, they have since been used in the wild, most notably by terjanq in 2019.
  • "Bar Ilan University detail a attack": "detailed" (past tense)?
  • Explain what an "amplification attack" is.
  • Link Christopher Evans -> Christopher Evans (computer scientist)?
    • Corollary to the above: take every person you mention and see if we have an article about them, in which case link to it.
  • I did try, but it seems like a lot of these people aren't well covered on Wikipedia. I'm gonna try and see if I have enough sourcing to create at least a start article about some of these people (Chris Evans and Luan Herrara in particular) after this review.

Defences

  • Is it "Defences" or "Defenses"? The nav box uses the latter, as does one of the sources you cite.
  • "this approach was infeasible for any non-trivial website due to the nature of the web platform.": you need to explain that.
  • "are extensions to the HTTP protocol that focuses": "focus"?
  • I'm not sure what to do with the two code snippets. They are clearly copied from Goethem et al, but with small changes (h1 instead of h2, etc). I'm going to ask around to see how that plays with copyright restrictions.
  • Van Goethem et al. should be under the CC-BY 4.0 license, based on the notice on the first page of the paper, which according to my understanding should be compatible with the one used by Wikipedia, but I think a third opinion would be great in this case as well :) Sohom (talk) 17:38, 8 November 2023 (UTC)

Preventing state changes

  • "X-Frame-Options header": more SEAOFBLUE

Making cross-origin requests stateless

  • "LAX+POST" in code style?

Completely isolating shared resources

  • "One of the earliest and most well-known methods...": if this was the earliest, maybe discuss it first?
  • "major browser vendors such as the likes of Chrome, Brave": I would drop the entire "major browser vendors such as the likes of" part.

Sources

  • I do agree with this in general, but I think the use case here is one where the author is a subject-matter expert or the blog is used for uncontroversial self-descriptions. Luan's blog is cited in the paper by Van Goethem et al. to describe his exploit on the monorail bug tracker, and the line immediately before this talks about exactly that. The only real "thing" that this citation supports is the list of products in which he found the security issues, which falls under uncontroversial self-descriptions. -- Sohom (talk) 12:37, 11 November 2023 (UTC)
OK, that works. I can't find anything else to complain about, good job! RoySmith (talk) 14:09, 11 November 2023 (UTC)

Other WP:GACR

  • No copyright problems found
  • Article looks to be adequately sourced with in-line citations and, except as noted above (Medium), to RS.
  • Breadth of coverage is appropriate.
  • No problems noted with neutrality or stability.
  • Illustrated with appropriately licensed media.

Spelling of Defenses

@Rockstone35 I was the author of the revision where the wrong spelling of Defenses was used, and as discussed in the GA review above, I corrected the spelling since most of the sources are in American English. I really don't understand why we need to bring WP:RETAIN into this. -- Sohom (talk) 15:02, 14 November 2023 (UTC)

Hi Sohom.
The body of the article uses British spelling, and all older non-stub revisions also do. I'm fine with using American spellings if no one else objects, but it does violate WP:RETAIN. -- RockstoneSend me a message! 16:03, 14 November 2023 (UTC)
After sleeping on it a bit, I honestly don't think it matters, so status quo is fine. Sorry for my interactions yesterday; as I mentioned, I was agitated about something else. Sohom (talk) 10:46, 15 November 2023 (UTC)

About italicization of "et al."

@Baffle gab1978: Thanks for your efforts in copyediting this article! I just wanted to leave a quick note that "et al." is not normally abbreviated per MOS:MISCSHORT/MOS:LATINABBR. TechnoSquirrel69 (sigh) 03:07, 6 February 2024 (UTC)

@TechnoSquirrel69: thanks for pointing this out. Et alia is abbreviated et al in the article; I haven't added the full version. I must have missed: "In normal usage, abbreviations of Latin words and phrases should be italicised, except AD, c., e.g., etc., i.e., and a few others not in italics in the table above; these ones have become ordinary parts of the English language." This batty rule is quite new to me; it was arbitrarily added here. I've always written et al. in Wikipedia, but I'll stop in this article. Cheers, Baffle☿gab 23:56, 6 February 2024 (UTC)

Reading

More cool research :) https://autoleak.org/paper.pdf Sohom (talk) 05:44, 29 April 2024 (UTC)

Notes/thoughts

  • Fixed a few of the issues brought up in old PR.
  • There doesn't seem to be a lot of scholarship on the impact of third-party (3p) cookie deprecation; do we leave that as is? (IMO it is expected to have a fair bit of impact, but if there is no coverage from RS then we probably shouldn't include it?)
  • After reading autoleak (linked above), I didn't find any new info that merits inclusion.
  • There are a few Privacy Sandbox cross-site leakages in the fledging paper; is it worth shoe-horning them in somewhere?

Sohom (talk) 06:21, 3 July 2024 (UTC)