Modify <meta> tag with JS (chrome extension) on response receiving

Modify <meta> tag with JS (chrome extension) on response receiving - javascript

I have a Chrome extension that adds a panel to the page in the floating iframe (on extension button click). There's certain JS code that is downloaded from 3rd party host and needs to be executed on that page. Obviously there's XSS issue and extension needs to comply with content security policies for that page.
Previously I had to deal with CSP directives that are passed via request headers, and was able to override those via setting a hook in chrome.webRequest.onHeadersReceived. There I was adding my host URL to content-security-policy headers. It worked. Headers were replaced, new directives applied to the page, all good.
Now I discovered websites that set the CSP directives via <meta> tag, they don't use request headers. For example, app pages in iTunes https://itunes.apple.com/us/app/olympics/id808794344?mt=8 have such. There is also an additional meta tag with name web-experience-app/config/environment (?) that somewhat duplicates the values that are set in content of tag with http-equiv="Content-Security-Policy".
This time I am trying to add my host name into meta tag inside chrome.webNavigation.onCommitted or onCompleted events listeners (JS vanilla via chrome.tabs.executeScript). I also experimented with running the same code from the webrequest's onCompleted listener (at the last step of lifecycle according to https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest).
When I inspect the page after its load - I see the meta tags have changed. But when I click on my extension to start loading iframe and execute JS - console prints the following errors:
Refused to frame 'https://myhost.com' because it violates the following Content Security Policy directive: "frame-src 'self' *.apple.com itmss: itms-appss: itms-bookss: itms-itunesus: itms-messagess: itms-podcasts: itms-watchs: macappstores: musics: apple-musics:".
I.e. my tag update was not effective.
I have several questions: first, do I do it right? Am I doing the update at the proper event? When is the data content from meta tags being read in the page lifecycle? Will it be auto-applied after tag content change?

As of March 2018 Chromium doesn't allow to modify the responseBody of the request. https://bugs.chromium.org/p/chromium/issues/detail?id=487422#c29
"WebRequest API: allow extensions to read response body" is a ticket from 2015. It is not on a path of getting to be resolved and needs some work/help.
--
Firefox has the webRequest filter implementation that allows to modify the response body before the page's meta directives are applied.
https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/webRequest/filterResponseData
BUT, my problem is focused on fixing the Chrome extension. Maybe Chrome picks this up one day.
--
In general, the Chrome's extension building framework seems like not a reliable path of building a long term living software; with browser vendors changing the rules frequently, reacting to newly discovered threats, having no up-to-date supported cross-browser standard.
--
In my case, the possible way around this issue can be to throw all the JS code into the extension's source base. Such that there's no 3-rd party to connect to fetch and execute the JS (and conflict with/violate the CSP rules). Haven't explored this yet, as I expected to reuse the code & interactive components I am using in my main browser application.

I've been interested in the same things and here are a few aspects that perhaps could help:
chrome.debugger extension API with Fetch (or Network) domain can be used to modify responseBody: https://chromedevtools.github.io/devtools-protocol/tot/Fetch/
This is an example implementation: https://github.com/mr-yt12/Debugger-API-Fetch-example-Chrome-Extension
However, I'm facing a problem of Fetch.requestPaused not firing on the first page load: Chrome Extension Debugger API, Fetch domain attaches/enables too late for the response body to be intercepted
And I haven't found a solution to this yet, besides first redirecting the request to 'http://google.com/gen_204' and then updating the tab. But this creates flicker and also I'm not sure if it's possible to redirect the request like this with manifest v3.
When using debugger API, Chrome shows a warning at the top, which changes the page's size and doesn't go away (perhaps it goes away 5 seconds after the debugger is detached and also if the user clicks "cancel"). This means that it's mostly only good for personal use or when distributing as a developer mode extension. Using --silent-debugger-extension-api flag with Chrome disables this warning.
I've tried injecting my script at document_start (when the meta tag is not yet created). Then I used mutationObserver (also tried other methods) to wait for the meta tag, to modify it before it's applied or fully created. Somehow it succeeded once or a few times, but could be a coincidence or a wrong interpretation of results. Perhaps it's worth experimenting with it.
Another idea (but I think I didn't make it work, but perhaps it's possible) is to use window.stop() at document_start and then rewrite the html content programmatically. This needs more researching.
It seems that meta tag CSP is applied once the meta tag is created (or while it's being created), and then there is no way to cancel what's been applied. It should be researched more on how to prevent it from applying or modifying it before it's fully created or applied.

Related

a website that is blocked from being embedded in an iFrame. I need a legal workaround that allows me to post my store on my website [duplicate]

I am developing a web page that needs to display, in an iframe, a report served by another company's SharePoint server. They are fine with this.
The page we're trying to render in the iframe is giving us X-Frame-Options: SAMEORIGIN which causes the browser (at least IE8) to refuse to render the content in a frame.
First, is this something they can control or is it something SharePoint just does by default? If I ask them to turn this off, could they even do it?
Second, can I do something to tell the browser to ignore this http header and just render the frame?

If the 2nd company is happy for you to access their content in an IFrame then they need to take the restriction off - they can do this fairly easily in the IIS config.
There's nothing you can do to circumvent it and anything that does work should get patched quickly in a security hotfix. You can't tell the browser to just render the frame if the source content header says not allowed in frames. That would make it easier for session hijacking.
If the content is GET only you don't post data back then you could get the page server side and proxy the content without the header, but then any post back should get invalidated.

UPDATE: 2019-12-30
It seem that this tool is no longer working! [Request for update!]
UPDATE 2019-01-06: You can bypass X-Frame-Options in an <iframe> using my X-Frame-Bypass Web Component. It extends the IFrame element by using multiple CORS proxies and it was tested in the latest Firefox and Chrome.
You can use it as follows:
(Optional) Include the Custom Elements with Built-in Extends polyfill for Safari:
<script src="https://unpkg.com/#ungap/custom-elements-builtin"></script>
Include the X-Frame-Bypass JS module:
<script type="module" src="x-frame-bypass.js"></script>
Insert the X-Frame-Bypass Custom Element:
<iframe is="x-frame-bypass" src="https://example.org/"></iframe>

The X-Frame-Options header is a security feature enforced at the browser level.
If you have control over your user base (IT dept for corp app), you could try something like a greasemonkey script (if you can a) deploy greasemonkey across everyone and b) deploy your script in a shared way)...
Alternatively, you can proxy their result. Create an endpoint on your server, and have that endpoint open a connection to the target endpoint, and simply funnel traffic backwards.

Yes Fiddler is an option for me:
Open Fiddler menu > Rules > Customize Rules (this effectively edits CustomRules.js).
Find the function OnBeforeResponse
Add the following lines:
oSession.oResponse.headers.Remove("X-Frame-Options");
oSession.oResponse.headers.Add("Access-Control-Allow-Origin", "*");
Remember to save the script!

As for second question - you can use Fiddler filters to set response X-Frame-Options header manually to something like ALLOW-FROM *. But, of course, this trick will work only for you - other users still won't be able to see iframe content(if they not do the same).

Is there an alternative to preprocessorScript for Chrome DevTools extensions?

I want to create a custom profiler for Javascript as a Chrome DevTools Extension. To do so, I'd have to instrument all Javascript code of a website (parse to AST, inject hooks, generate new source). This should've been easily possible using chrome.devtools.inspectedWindow.reload() and its parameter preprocessorScript described here: https://developer.chrome.com/extensions/devtools_inspectedWindow.
Unfortunately, this feature has been removed (https://bugs.chromium.org/p/chromium/issues/detail?id=438626) because nobody was using it.
Do you know of any other way I could achieve the same thing with a Chrome Extension? Is there any other way I can replace an incoming Javascript source with a changed version? This question is very specific to Chrome Extensions (and maybe extensions to other browsers), I'm asking this as a last resort before going a different route (e.g. dedicated app).

Use the Chrome Debugging Protocol.
First, use DOMDebugger.setInstrumentationBreakpoint with eventName: "scriptFirstStatement" as a parameter to add a break-point to the first statement of each script.
Second, in the Debugger Domain, there is an event called scriptParsed. Listen to it and if called, use Debugger.setScriptSource to change the source.
Finally, call Debugger.resume each time after you edited a source file with setScriptSource.
Example in semi-pseudo-code:
// Prevent code being executed
cdp.sendCommand("DOMDebugger.setInstrumentationBreakpoint", {
eventName: "scriptFirstStatement"
});
// Enable Debugger domain to receive its events
cdp.sendCommand("Debugger.enable");
cdp.addListener("message", (event, method, params) => {
// Script is ready to be edited
if (method === "Debugger.scriptParsed") {
cdp.sendCommand("Debugger.setScriptSource", {
scriptId: params.scriptId,
scriptSource: `console.log("edited script ${params.url}");`
}, (err, msg) => {
// After editing, resume code execution.
cdg.sendCommand("Debugger.resume");
});
}
});
The implementation above is not ideal. It should probably listen to the breakpoint event, get to the script using the associated event data, edit the script and then resume. Listening to scriptParsed and then resuming the debugger are two things that shouldn't be together, it could create problems. It makes for a simpler example, though.

On HTTP you can use the chrome.webRequest API to redirect requests for JS code to data URLs containing the processed JavaScript code.
However, this won't work for inline script tags. It also won't work on HTTPS, since the data URLs are considered unsafe. And data URLs are can't be longer than 2MB in Chrome, so you won't be able to redirect to large JS files.
If the exact order of execution of each script isn't important you could cancel the script requests and then later send a message with the script content to the page. This would make it work on HTTPS.
To address both issues you could redirect the HTML page itself to a data URL, in order to gain more control. That has a few negative consequences though:
Can't reload page because URL is fixed to data URL
Need to add or update <base> tag to make sure stylesheet/image URLs go to the correct URL
Breaks ajax requests that require cookies/authentication (not sure if this can be fixed)
No support for localStorage on data URLs
Not sure if this works: in order to fix #1 and #4 you could consider setting up an HTML page within your Chrome extension and then using that as the base page instead of a data URL.
Another idea that may or may not work: Use chrome.debugger to modify the source code.

Inject content script into iFrame across domains (but with optional permissions)? [duplicate]

Content Script can be injected programatically or permanently by declaring in Extension manifest file. Programatic injection require host permission, which is generally grant by browser or page action.
In my use case, I want to inject gmail, outlook.com and yahoo mail web site without user action. I can do by declaring all of them manifest, but by doing so require all data access to those account. Some use may want to grant only outlook.com, but not gmail. Programatic injection does not work because I need to know when to inject. Using tabs permission is also require another permission.
Is there any good way to optionally inject web site?

You cannot run code on a site without the appropriate permissions. Fortunately, you can add the host permissions to optional_permissions in the manifest file to declare them optional and still allow the extension to use them.
In response to a user gesture, you can use chrome.permission.request to request additional permissions. This API can only be used in extension pages (background page, popup page, options page, ...). As of Chrome 36.0.1957.0, the required user gesture also carries over from content scripts, so if you want to, you could add a click event listener from a content script and use chrome.runtime.sendMessage to send the request to the background page, which in turn calls chrome.permissions.request.
Optional code execution in tabs
After obtaining the host permissions (optional or mandatory), you have to somehow inject the content script (or CSS style) in the matching pages. There are a few options, in order of my preference:
Use the chrome.declarativeContent.RequestContentScript action to insert a content script in the page. Read the documentation if you want to learn how to use this API.
Use the webNavigation API (e.g. chrome.webNavigation.onCommitted) to detect when the user has navigated to the page, then use chrome.tabs.executeScript to insert the content script in the tab (or chrome.tabs.insertCSS to insert styles).
Use the tabs API (chrome.tabs.onUpdated) to detect that a page might have changed, and insert a content script in the page using chrome.tabs.executeScript.
I strongly recommend option 1, because it was specifically designed for this use case. Note: This API was added in Chrome 38, but only worked with optional permissions since Chrome 39. Despite the "WARNING: This action is still experimental and is not supported on stable builds of Chrome." in the documentation, the API is actually supported on stable. Initially the idea was to wait for a review before publishing the API on stable, but that review never came and so now this API has been working fine for almost two years.
The second and third options are similar. The difference between the two is that using the webNavigation API adds an additional permission warning ("Read your browsing history"). For this warning, you get an API that can efficiently filter the navigations, so the number of chrome.tabs.executeScript calls can be minimized.
If you don't want to put this extra permission warning in your permission dialog, then you could blindly try to inject on every tab. If your extension has the permission, then the injection will succeed. Otherwise, it fails. This doesn't sound very efficient, and it is not... ...on the bright side, this method does not require any additional permissions.
By using either of the latter two methods, your content script must be designed in such a way that it can handle multiple insertions (e.g. with a guard). Inserting in frames is also supported (allFrames:true), but only if your extension is allowed to access the tab's URL (or the frame's URL if frameId is set).

I advise against using declarativeContent APIs because they're deprecated and buggy with CSS, as described by the last comment on https://bugs.chromium.org/p/chromium/issues/detail?id=708115.
Use the new content script registration APIs instead. Here's what you need, in two parts:
Programmatic script injection
There's a new contentScripts.register() API which can programmatically register content scripts and they'll be loaded exactly like content_scripts defined in the manifest:
browser.contentScripts.register({
matches: ['https://your-dynamic-domain.example.com/*'],
js: [{file: 'content.js'}]
});
This API is only available in Firefox but there's a Chrome polyfill you can use. If you're using Manifest v3, there's the native chrome.scripting.registerContentScript which does the same thing but slightly differently.
Acquiring new permissions
By using chrome.permissions.request you can add new domains on which you can inject content scripts. An example would be:
// In a content script or options page
document.querySelector('button').addEventListener('click', () => {
chrome.permissions.request({
origins: ['https://your-dynamic-domain.example.com/*']
}, granted => {
if (granted) {
/* Use contentScripts.register */
}
});
});
And you'll have to add optional_permissions in your manifest.json to allow new origins to be requested:
{
"optional_permissions": [
"*://*/*"
]
}
In Manifest v3 this property was renamed to optional_host_permissions.
I also wrote some tools to further simplify this for you and for the end user, such as
webext-domain-permission-toggle and webext-dynamic-content-scripts. They will automatically register your scripts in the next browser launches and allow the user the remove the new permissions and scripts.

Since the existing answer is now a few years old, optional injection is now much easier and is described here. It says that to inject a new file conditionally, you can use the following code:
// The lines I have commented are in the documentation, but the uncommented
// lines are the important part
//chrome.runtime.onMessage.addListener((message, callback) => {
// if (message == “runContentScript”){
chrome.tabs.executeScript({
file: 'contentScript.js'
});
// }
//});
You will need the Active Tab Permission to do this.

Can I use indexeddb across subdomains?

I'm building a Chrome extension and using the db.js wrapper to utilize the indexeddb. The problem is, I've got several subdomains and I'd like to be able to share the information across them.
When I use the Chrome Dev tools to view Resources, all of the individual subdomains have their own copy of the schema I'm creating, and each has it's own data.
The only thing I knew to try was to set the document.domain but that didn't help. I wasn't surprised.
Documentation on indexeddb is very slim it seems. I keep finding the same 2 or 3 blog posts copied word for word in several different blogs and nothing specifies that this is possible or impossible.

You can't access the same database from multiple subdomains, the access scope is limited to html origin.
html_Origin = protocol + "://" + hostname + ":" + port + "/";

As #Xan mentioned, if you can use a common origin owned by the extension itself, rather than by the content pages, that sounds like it would be by far the easiest solution. If for whatever reason you can't do that (or for readers who got here wanting to know about regular page javascript or Greasemonkey-style userscripts, rather than extensions), the answer is:
Yes, though it's a slightly awkward and takes some work:
Since you're using a number of related subdomains, (rather than completely unrelated domains), there's a technique you can use in that situation. It can be applied to IndexedDB, localStorage, SharedWorker, BroadcastChannel, etc, all of which offer shared functionality between same-origin pages, but for some reason don't respect modifications to document.domain.
(1) Pick one "main" subdomain to for the data to belong to. i.e. if your subdomains are https://a.example.com, https://b.example.com, and https://c.example.com, you might choose to have your IndexedDB database stored under the https://a.example.com subdomain.
(2) Use it normally from all the the https://a.example.com pages.
(3) On https://b.example.com and https://c.example.com, use javascript to set document.domain = "example.com";. Then also create a hidden <iframe>, and navigate it to some page on the https://a.example.com domain (It doesn't matter what page, as long as you can insert a very little snippet of javascript on there. If you're creating the site, just make an empty page specifically for this purpose. If you're writing an extension or a userscript and so don't have any control over pages on the example.com server, just pick the most lightweight page you can find and insert your script into it. Some kind of "not found" page would probably be fine).
(4) The script on the hidden iframe page need only (a) set document.domain = "example.com";, and (b) notify the parent window when this is done. After that, the parent window can access the iframe window and all its objects without restriction! So the minimal iframe page is something like:
<!doctype html>
<html>
<head>
<script>
document.domain = "example.com";
window.parent.iframeReady(); // function defined & called on parent window
</script>
</head>
<body></body>
</html>
If writing a userscript, you might not want to add externally-accessible functions such as iframeReady() to your unsafeWindow, so instead a better way to notify the main window userscript might be to use a custom event:
window.parent.dispatchEvent(new CustomEvent("iframeReady"));
Which you'd detect by adding a listener for the custom "iframeReady" event to your main page's window.
(5) Once the hidden iframe has informed its parent window that it's ready, script in the parent window can just use iframe.contentWindow.indexedDB, iframe.contentWindow.localStorage, iframe.contentWindow.BroadcastChannel, iframe.contentWindow.SharedWorker instead of window.indexedDB, window.localStorage etc. ...and all these objects will be scoped to the https://a.example.com origin - so they'll have the this same shared origin for all of your pages!
The "awkward" part of this technique is mostly that you have to wait for the iframe to load before proceeding. So you can't just blithely initialize IndexedDB in your DOMContentLoaded handler, for example. Also you might want to add some error handling to detect if the hidden iframe fails to load correctly.
Obviously, you should also make sure the hidden iframe is not removed or navigated during the lifetime of your page... OTOH I don't know what the result of that would be, but very likely bad things would happen.
And, a caveat: setting/changing document.domain can be blocked using the Feature-Policy header, in which case this technique will not be usable as described.
However, there is a significantly more-complicated generalization of this technique, that can't be blocked by Feature-Policy, and that also allows entirely unrelated domains to share data, communications, and shared workers (i.e. not just subdomains off a common superdomain). #Xan alludes to it in point (2) of his answer:
The general idea is that, just as above, you create a hidden iframe to provide the correct origin for access; but instead of then just grabbing the iframe window's properties directly, you use script inside the iframe to do all of the work, and you communicate between the iframe and your main window only using postMessage() and addEventListener("message",...).
This works because postMessage() can be used even between different-origin windows. But it's also significantly more complicated because you have to pass everything through some kind of messaging infrastructure that you create between the iframe and the main window, rather than (for example) just using the IndexedDB API directly in your main window's code.

HTML-based storage (indexedDB, localStorage) in Chrome extensions behaves in a way that might not be expected, but it's perfectly natural.
In the background page, the domain is chrome-extension://yourextensionid/, and this is shared by all extension pages and is persistent.
In the content scripts though, you're sharing the HTML storage with the domain you're operating on. This makes life difficult if you want it to share/persist things. Note that sometimes this behavior is actually helpful.
The universal solution is to keep the DB in a background script, and communicate data/requests by means of Messaging API.
This was the usual solution for localStorage use until chrome.storage came along. But since you're using a database, you don't have a ready extension-friendly replacement.

Is there a way to mitigate downloading of resources (images/css and js files) with Javascript?

I have a html page on my localhost - get_description.html.
The snippet below is part of the code:
<input type="text" id="url"/>
<button id="get_description_button">Get description</button>
<iframe id="description_container" src="#"/>
When the button is clicked the src of the iframe is set to the url entered in the textbox. The pages fetched this way are very big with lots of linked files. What I am interested in the page is a block of text contained in a <div id="description"> element.
Is there a way to mitigate downloading of resources linked in the page that loads into the iframe?
I don't want to use curl because the data is only available to logged in users and the steps to take with curl to get the content is too complicated. The iframe is simple as I use this on a box which sends the right cookies to identify the request as coming from a logged in user, but the problem is that it is very wasteful to get nearly 1 MB of data to keep 1 KB of it and throw out the rest.
Edit
If the proposed method just works in Firefox it is fine, so I added Firefox tag. Also, it is possible that the answer actually is from the realm of Firefox add-on techniques, so I added that tag as well.
The problem is not that I cannot get at what I'm looking for, rather, the problem is the easy iframe method is wasteful.
I know that Firefox does allow loading only the text of a page. If you open a page and press Ctrl+U you are taken to 'view page source' window, There links behave as normal and are clickable, if you click on a link in source view, the source of the new page is loaded into the view source window, without the linked resources being downloaded, exactly what I'm trying to get. But I don't know how to access this behaviour.
Another example is the Adblock add-on. It somehow kills elements before they get loaded. With plain Javascript this is not possible. Because it only is triggered too late to intervene in good time.

The Same Origin Policy forbids any web page to access contents of any other web page in a different domain so basically you cannot do that.
However it seems that with some browsers it is allowed to access web pages content if you are trying to access it from a local web page which seems to be your case.
Safari, IE 6/7/8 are browser that allow a local web page to do so via XMLHttpRequest (source: Google Browser Security Handbook) so you may want to choose to use one of those browsers to do what you need (note that future versions of those browsers may not allow to do so anymore).
A part from this solution I only see two possibities:
If the web pages you need to fetch content from are somehow controlled by you, you can create a simpler interface to let other web pages to get the content you need (for example allowing JSONP requests).
If the web pages you need to fetch content from are not controlled by you the only solution I see is to fetch content server side logging in from the server directly (I know that you don't want to do so, but I don't see any other possibility if the previous I mentioned are not practicable)
Hope it helps.

Actually I've seen Cross Domain jQuery .load request before, here: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The author claims that codes like these found on that page
$('#container').load('http://google.com'); // SERIOUSLY!
$.ajax({
url: 'http://news.bbc.co.uk',
type: 'GET',
success: function(res) {
var headline = $(res.responseText).find('a.tsh').text();
alert(headline);
}
});
// Works with $.get too!
would work. (The BBC code might not work because of the recent redesign, but you get the idea)
Apparently it is using YQL wrapped into a jQuery plugin to do the trick. Now I cannot say I fully understand what he is doing there but it appears to work, and fits the bill. Once you load the data I suppose it is a simple matter of filtering out the data that you need.
If you prefer something that works at the browser level, may I suggest Mozilla's Jetpack framework for lightweight extensions. I've not yet read the documentations in its entirety but it should contain the APIs needed for this to work.

There are various ways to go about this in AJAX, I'm going to show the jQuery way for brevity as one option, though you could do this in vanilla JavaScript as well.
Instead of an <iframe> you can just use a container, let's say a <div> like this:
<div id="description_container"></div>
Then to load it:
$(function() {
$("#get_description_button").click(function() {
$("#description_container").load($("input").val() + " #description");
});
});
This uses the .load() method which takes a string in this format: .load("url selector"), then takes that element in the page and places it's content inside the container you're loading, in this case #description_container.
This is just the jQuery route, mainly to illustrate that yes, you can do what you want, but you don't have to do it exactly like this, just showing the concept is getting what you want from an AJAX request, rather than in an <iframe>.

Your description sounds like you are fetching pages from the same domain (you said that you need to be logged in and have session credentials) so have you tried to use async request via XMLHttpRequest? It might complain if the html on a page is particularly messed up but you chould still be able to get raw text via .responseText and extract what you need with a regex.

We Keep Coding

JavaScript is the programming language of the Web.