Javascript ignoring all files as opposed to one - javascript

I'm trying to get my JavaScript to ignore one file extension in a folder that holds a bunch of Photoshop images. Files of every other type in the folder populate a window, from which the user can import them into their workspace.
I have modified my script to ignore the extension I want ignored, but it no longer populates the window with the other file types contained in the folder. When I remove the file I want ignored from the folder, the window is populated as it should be.
This is what I have at the moment that checks my folder for the file types:
//Prompt for folder location
var Path = Folder.selectDialog("Select Folder Location for Renders")
// Use the path to the application and append the samples folder
var samplesFolder = Folder(Path)
//Get the files
var fileList = samplesFolder.getFiles()
//Create array to hold names
var renderTypes = new Array();
//Parse Initial name to get similar render elements
var beautyRender = fileList[0].name
beautyRender = beautyRender.substr(0, beautyRender.length-4)
//Get the render elements with a similar name
for (var i = 0; i < fileList.length; i++)
{
var filename = fileList[i].name;
if (filename.match(/\.(stitchinfo)$/i) == null)
{
if(fileList[i].name.substring(0,beautyRender.length) === beautyRender)
{
renderTypes.push(fileList[i].name);
}
}
}
Can anyone see what I've done wrong and need to modify?
Update
I'm still trying to get this to work and following the help from one of the posters below I have modified my code to the following:
for (var i = 0; i < fileList.length; i++)
{
var filename = fileList[i].name;
if (filename.match(/\.(stitchinfo)$/i) == null)
{
renderTypes.push(fileList[i].name);
}
}
However, with this new code comes a new problem in that it returns every file contained in the folders and displays it.
I'm still stumped as to how I can get this to work as I would like. Please can anyone help me?

You're creating a sparse array, because you skip elements in renderTypes when you ignore a filename in fileList. That may be confusing your rendering code. Change to:
renderTypes.push(fileList[i].name);

What if :
for (var i = 0; i < fileList.length; i++)
{
var filename = fileList[i].name;
if (filename.match(/\.(stitchinfo)$/i) == null)
{
if(fileList[i].name.substring(0,beautyRender.length) === beautyRender)
{
renderTypes.push(fileList[i].name);
}
}
}
Wrong usage of the array
Missing ";"
Unnecessary use of "continue".

I managed to get a fix for this.
What I ended up doing is creating a filter function and passing it to the getFiles() call that reads the folder.
function ignoreMe(f)
{
    return f.name.match(/\.(stitchinfo)$/i) == null;
}
And the folder check:
var fileList = samplesFolder.getFiles(ignoreMe);
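One possible reason the original loop returned nothing is that beautyRender is taken from fileList[0]: if the .stitchinfo file happens to be listed first, the prefix is derived from its name and no other file matches it. A minimal sketch of filtering first and deriving the prefix afterwards, written against plain file-name strings so it runs outside the host application (selectRenderNames and the sample names are hypothetical, not part of the original script):

```javascript
// Hypothetical helper operating on plain file-name strings.
function selectRenderNames(names) {
  // Drop the unwanted extension before anything else looks at the list.
  var kept = [];
  for (var i = 0; i < names.length; i++) {
    if (!/\.stitchinfo$/i.test(names[i])) {
      kept.push(names[i]);
    }
  }
  if (kept.length === 0) {
    return [];
  }
  // Derive the shared prefix from the first remaining name, minus its extension.
  var first = kept[0];
  var dot = first.lastIndexOf('.');
  var prefix = dot > 0 ? first.substring(0, dot) : first;
  // Keep only the names that start with that prefix.
  var result = [];
  for (var j = 0; j < kept.length; j++) {
    if (kept[j].substring(0, prefix.length) === prefix) {
      result.push(kept[j]);
    }
  }
  return result;
}

// Made-up folder listing in which the ignored file comes first.
var sample = ['beauty.stitchinfo', 'beauty.png', 'beauty_spec.png', 'other.png'];
var selected = selectRenderNames(sample);
```

Passing a filter to getFiles(), as in the fix above, has the same effect: the ignored file never reaches fileList[0].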

Related

I need a javascript that can extract specific text from a PDF

I work as legal support in litigation. I am not that clued up on scripting, but I have managed to adapt a few Google search results to perform various tasks in Adobe.
What I need is help with what I think should be a simple script to read through a PDF and extract Document IDs. They are enclosed in square brackets, so I just need to extract all text between square brackets to a text or CSV file. I have tried using the ChatGPT bot, but that hasn't been very successful. This is the code it has given me:
// Open the PDF file
var filePath = "/path/to/your/file.pdf";
var doc = app.open(filePath);
// Get the number of pages
var numPages = doc.numPages;
// Create an array to hold the results
var results = [];
// Loop through each page and extract text between square brackets
for (var i = 0; i < numPages; i++) {
var page = doc.getPageNthWordQuads(i);
for (var j = 0; j < page.length; j++) {
var word = page[j];
var text = word[4];
// Check if the text is between square brackets
if (text.startsWith("[") && text.endsWith("]")) {
// Remove the brackets and add the text to the results array
results.push(text.slice(1, -1));
}
}
}
// Save the results to a text file
var outputPath = "/path/to/your/output/file.txt";
var outputFile = new File(outputPath);
outputFile.open("w");
outputFile.write(results.join("\n"));
outputFile.close();
// Close the PDF file
doc.close();
I ran the script with my own file path in place of the placeholder, but nothing happened. There was no error or any other output.
I am using a work PC, so I can't install Python or any other program, hence the need for JavaScript, or possibly PowerShell if that will work.
Can anyone help me?
Actually, I realised I could do this using the Evermap plugin: highlight text by pattern - [(.*?)], then extract the highlighted text.
This can be achieved with the pdf.js library. The example below checks for specific text on the first page, but it can be extended to check the whole PDF. Hope this helps!
// Load PDF.js library
const pdfjsLib = require('pdfjs-dist');
// Load PDF file
const url = 'path/to/pdf/file.pdf';
const loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function(pdf) {
// Load the first page
pdf.getPage(1).then(function(page) {
// Get the text content of the page
page.getTextContent().then(function(textContent) {
// Iterate through each text item
for (let i = 0; i < textContent.items.length; i++) {
const item = textContent.items[i];
// Check if the text item matches your criteria
if (item.str.includes('specific text')) {
console.log(item.str);
}
}
});
});
});
You can get rid of the entire for loop and just use a regular expression with String.match():
const data = "This is the text of a pdf file [documentid1] and [documentid2].";
const matches = data.match(/(?<=\[).*?(?=\])/gs);
console.log(matches);
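Lookbehind assertions such as (?<=\[) are not available in every JavaScript engine, so a capture group may be a safer choice in older hosts. A minimal sketch, assuming the PDF text has already been extracted into a string (extractBracketedIds and the sample text are hypothetical):

```javascript
// Hypothetical helper: collect the text found between each pair of square brackets.
function extractBracketedIds(text) {
  // The capture group holds everything between "[" and the next "]".
  var pattern = /\[([^\]]*)\]/g;
  var ids = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    ids.push(match[1]);
  }
  return ids;
}

var sampleText = 'This is the text of a pdf file [documentid1] and [documentid2].';
var documentIds = extractBracketedIds(sampleText);
// One ID per line, ready to be written to a text or single-column CSV file.
var fileBody = documentIds.join('\n');
```

How the extracted text is obtained and how fileBody is saved depends on the host (Acrobat, Node with pdf.js, and so on), so both steps are left out here.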

Why won't the makeCopy portion of my script execute properly in Google Apps Script?

I am a very novice coder and am trying to accomplish the following using a Google Form:
1. Rename the file uploaded by the user, using a name built from a combination of form fields
2. Create a copy of the uploaded file in a specific folder in Google Drive, chosen by the answer to a particular form question
So far, I have managed to get Part 1 working, but Part 2 doesn't seem to function properly (no error message, just no action). Anyone able to guide me where I'm going wrong?
function fileRename() {
var form = FormApp.getActiveForm()
// returns the total number of form submissions
var length=form.getResponses().length;
//retrieve fileID of document uploaded by user in Question 6 of the form (i.e. Index 5)
var id=form.getResponses()[length-1].getItemResponses()[5].getResponse();
//getResponses()[length-1] retrieves the last form response, accounting for the fact that the first index is zero and the last is length-1
//gets the form answers used to concatenate the file name
var fileUploadEntity=form.getResponses()[length-1].getItemResponses()[0].getResponse();
var fileUploadDate=form.getResponses()[length-1].getItemResponses()[3].getResponse();
var fileUploadType=form.getResponses()[length-1].getItemResponses()[1].getResponse();
//accesses the uploaded file
var file=DriveApp.getFileById(id);
var name = file.getName();
//changes the file name
var name = fileUploadEntity+'_'+fileUploadDate+'_'+fileUploadType
file.setName(name);
//creates a copy and saves it to the relevant regional shared drive depending on which array the entity belongs to, using its four-letter identifier
var APAC = ["WRAU", "WRNZ", "WRSG", "WRMY", "WRHK"];
var NORAM = ["WRCA", "WRCC", "WRCW", "WRUS"];
var MEA = ["WRKE", "WRUG", "WRSO", "WRSA", "WRRW", "WRTZ", "WRZW"];
var LATAM = ["WRMX"];
var EEA = ["WRBE", "WRUK"];
var folderAPAC = DriveApp.getFolderById('1IKIDSEEGHf802WaF4l4ntN9uiUO5jJpa');
var folderNORAM = DriveApp.getFolderById('1BitldN3Uw7453wxnnI1X5PUmbmTiQn5O');
var folderMEA = DriveApp.getFolderById('18tWR1C-mdO7moAtktOHJsvXjx_V0kdg0');
var folderLATAM = DriveApp.getFolderById('1cG0iPocn3KyXK8XgaxnZNWVU-HKJ97dX');
var folderEEA = DriveApp.getFolderById('1N8tB8AjMkR7gRarcwd4NYmry_wh0WVkY');
if (fileUploadEntity.indexOf(APAC)>-1) {
file.makeCopy(name, folderAPAC);
}
else if (fileUploadEntity.indexOf(NORAM)>-1) {
file.makeCopy(name, folderNORAM);
}
else if (fileUploadEntity.indexOf(LATAM)>-1) {
file.makeCopy(name, folderLATAM);
}
else if (fileUploadEntity.indexOf(MEA)>-1) {
file.makeCopy(name, folderMEA);
}
else if (fileUploadEntity.indexOf(EEA)>-1) {
file.makeCopy(name, folderEEA);
}
}
Your code is using indexOf the wrong way round.
Instead of
fileUploadEntity.indexOf(APAC)
try
APAC.indexOf(fileUploadEntity)
Do the same for the other places where indexOf is used.
Reference
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
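The reason the original test fails silently is that fileUploadEntity.indexOf(APAC) calls the string method, which converts the array to the comma-joined text "WRAU,WRNZ,..." and searches for that inside a four-letter code. A minimal sketch of the lookup done the other way round, assuming fileUploadEntity is a string holding one code (REGION_ENTITIES and findRegion are hypothetical names; the lists are copied from the question):

```javascript
// Entity codes per region, as listed in the question.
var REGION_ENTITIES = {
  APAC: ['WRAU', 'WRNZ', 'WRSG', 'WRMY', 'WRHK'],
  NORAM: ['WRCA', 'WRCC', 'WRCW', 'WRUS'],
  MEA: ['WRKE', 'WRUG', 'WRSO', 'WRSA', 'WRRW', 'WRTZ', 'WRZW'],
  LATAM: ['WRMX'],
  EEA: ['WRBE', 'WRUK']
};

// Hypothetical helper: return the region whose list contains the entity, or null.
function findRegion(entity) {
  for (var region in REGION_ENTITIES) {
    if (REGION_ENTITIES.hasOwnProperty(region) &&
        REGION_ENTITIES[region].indexOf(entity) > -1) {
      return region;
    }
  }
  return null;
}

// The original direction: a string searched for a comma-joined array.
var wrongWay = 'WRAU'.indexOf(['WRAU', 'WRNZ']);
// The corrected direction: an array searched for the string.
var rightWay = ['WRAU', 'WRNZ'].indexOf('WRAU');
```

With a lookup like this, the chain of else-if branches could be replaced by one map from region name to folder, though that restructuring is optional.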

How do I pass a white list to the caja web standalone script

I'm using http://caja.appspot.com/html-css-sanitizer-minified.js to sanitize user html, however in some instances I want to restrict the tags used to just a white list.
I've found https://code.google.com/p/google-caja/wiki/CajaWhitelists which describes how to define a white list, but I can't work out how to pass it to the html_sanitize method provided by html-css-sanitizer-minified.js
I've tried calling html.sanitizeWithPolicy(the_html, white_list); but I get an error:
TypeError: a is not a function
Which is hard to debug due to the minification, but it seems likely that html-css-sanitizer-minified.js does not contain everything in the html-sanitizer.js file.
I've tried using html-sanitizer.js combined with cssparser.js instead of the minified version, but I get errors before calling it, presumably because I am missing other dependencies.
How can I make this work?
Edit: sanitizeWithPolicy does exist in the minified file, but something is missing further down the process. This suggests that this file can't be used with a custom white list. I'm now investigating whether it is possible to work out which unminified files I need to include to make my own version.
Edit2: I was missing two files https://code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/html4-defs.js?spec=svn1950&r=1950 and https://code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/uri.js?r=5170
However, I am now getting an error because sanitizeWithPolicy expects a function, not a whitelist object. Also, the html4-defs.js file is very old, and according to this I would have to build the caja project in order to get a more recent one.
I solved this by downloading the unminified files
https://code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/html-sanitizer.js
https://code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/uri.js
https://code.google.com/p/google-caja/source/browse/trunk/src/com/google/caja/plugin/html4-defs.js?spec=svn1950&r=1950
(This last one is from an old revision. This file is built from the Java files, would be great if a more up to date one was available.)
I then added a new function to html-sanitizer.js
/**
* Trims down the element white list to just those passed in whilst still not allowing unsafe elements.
* @param {array} custom_elements An array of elements to include.
*/
function useCustomElements(custom_elements) {
var length = custom_elements.length;
var new_elements = {};
for (var i = 0; i < length; i++) {
var key = custom_elements[i].toLowerCase();
if (typeof elements.ELEMENTS[key] !== 'undefined') {
new_elements[key] = elements.ELEMENTS[key];
}
}
elements.ELEMENTS = new_elements;
};
I then made this function public by adding this near the end of the file, with the other public function statements.
html.useCustomElements = html['useCustomElements'] = useCustomElements;
Now I can call it like so:
var raw = '<p>This element is kept</p><div>this element is not</div>';
var white_list = ['p', 'b'];
html.useCustomElements(white_list);
var sanitized = html.sanitize(raw);
I then manually added some html5 elements to the html4-defs.js file (The ones that just define block elements like and ).
The attributes sanitization was still broken. This is due to the html4-defs.js file being out of date with the html-sanitizer.js. I changed this in html-sanitizer.js :
if ((attribKey = tagName + '::' + attribName,
elements.ATTRIBS.hasOwnProperty(attribKey)) ||
(attribKey = '*::' + attribName,
elements.ATTRIBS.hasOwnProperty(attribKey))) {
atype = elements.ATTRIBS[attribKey];
}
to
if (elements.ATTRIBS.hasOwnProperty(attribName)) {
atype = elements.ATTRIBS[attribName];
}
This is far from ideal but without compiling Caja and generating an up to date html-defs.js file I can't see a way around this.
This still leaves css sanitization. I would like this as well, but I am missing the css def files and can't find any that work via search so I have turned it off for now.
EDIT: I've managed to extract the html-defs from html-css-sanitizer-minified.js.
I've uploaded a copy to here. It includes elements like 'nav' so it has been updated for html5.
I've tried to do the same for the css parsing. I managed to extract the defs, but they depend on a bit count, and I can't find any way to calculate which bits were used for which defaults.
I've decided on another approach. I've left the other answer in case I manage to find the bit values for the css definitions as it would be preferable to this one if I could get it to work.
This time I've taken the html-css-sanitizer-minified file and injected a bit of code into it so that the element and attributes can be modified.
Search for :
ka=/^(?:https?|mailto)$/i,m={};
And after it insert the following:
var unmodified_elements = {};
for(var property_name in $.ELEMENTS) {
unmodified_elements[property_name] = $.ELEMENTS[property_name];
};
var unmodified_attributes = {};
for(var property_name in $.ATTRIBS) {
unmodified_attributes[property_name] = $.ATTRIBS[property_name];
};
var resetElements = function () {
$.ELEMENTS = {};
for(var property_name in unmodified_elements) {
$.ELEMENTS[property_name] = unmodified_elements[property_name];
}
$.f = $.ELEMENTS;
};
var resetAttributes = function () {
$.ATTRIBS = {};
for(var property_name in unmodified_attributes) {
$.ATTRIBS[property_name] = unmodified_attributes[property_name];
}
$.m = $.ATTRIBS;
};
var resetWhiteLists = function () {
resetElements();
resetAttributes();
};
/**
* Trims down the element white list to just those passed in whilst still not allowing unsafe elements.
* @param {array} custom_elements An array of elements to include.
*/
var applyElementsWhiteList = function(custom_elements) {
resetElements();
var length = custom_elements.length;
var new_elements = {};
for (var i = 0; i < length; i++) {
var key = custom_elements[i].toLowerCase();
if (typeof $.ELEMENTS[key] !== 'undefined') {
new_elements[key] = $.ELEMENTS[key];
}
}
$.f = new_elements;
$.ELEMENTS = new_elements;
};
/**
* Trims down the attribute white list to just those passed in whilst still not allowing unsafe elements.
* @param {array} custom_attributes An array of attributes to include.
*/
var applyAttributesWhiteList = function(custom_attributes) {
resetAttributes();
var length = custom_attributes.length;
var new_attributes = {};
for (var i = 0; i < length; i++) {
var key = custom_attributes[i].toLowerCase();
if (typeof $.ATTRIBS[key] !== 'undefined') {
new_attributes[key] = $.ATTRIBS[key];
}
}
$.m = new_attributes;
$.ATTRIBS = new_attributes;
};
m.applyElementsWhiteList = applyElementsWhiteList;
m.applyAttributesWhiteList = applyAttributesWhiteList;
m.resetWhiteLists = resetWhiteLists;
You can now apply a white list with :
var raw = "<a>element tags removed</a><p class='class-removed' style='color:black'>the p tag is kept</p>";
var tag_white_list = [
'p'
];
var attribute_white_list = [
'*::style'
];
html.applyElementsWhiteList(tag_white_list);
html.applyAttributesWhiteList(attribute_white_list);
var san = html.sanitize(raw);
This approach also sanitizes the styles, which I needed. Another white list could be injected for those, but I don't need that so I haven't written one.
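The core step in both approaches is the same: keep only those entries of the existing definitions table whose keys appear in the custom list, so nothing outside the original table can be introduced. A standalone sketch of that step, using a made-up definitions object rather than Caja's real tables (pickAllowed and sampleElements are hypothetical):

```javascript
// Hypothetical helper: copy only the allowed keys that already exist in the table.
function pickAllowed(definitions, allowedKeys) {
  var picked = {};
  for (var i = 0; i < allowedKeys.length; i++) {
    var key = allowedKeys[i].toLowerCase();
    // Keys missing from the original table are ignored, never added.
    if (Object.prototype.hasOwnProperty.call(definitions, key)) {
      picked[key] = definitions[key];
    }
  }
  return picked;
}

// Made-up stand-in for a table such as ELEMENTS; the values are placeholders.
var sampleElements = { p: 0, b: 0, div: 0 };
var trimmedElements = pickAllowed(sampleElements, ['P', 'script']);
```

Because the helper returns a new object and leaves its input untouched, the unmodified table stays available for a later reset without keeping a separate copy.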

How can I remove the parent folders in an array of paths?

I am using the following code in order to retrieve sub folder names from the path declared. This works fine but how do I then remove the path name so that the array is a list of just folder names?
var myPath = Folder ("Z:/My File System/Me/Work Files/Design");
var folders = getFolders (myPath);
function getFolders(sourceFolder) {
var folderArray = new Array();
var sFolders = sourceFolder.getFiles ();
var len = sFolders.length;
for (var i = 0; i < len; i++) {
var sFolder = sFolders[i];
if (sFolder instanceof Folder) {
folderArray.push(sFolder);
}
}
return folderArray;
}
Instead of returning:
Z:/My File System/Me/Work Files/Design/One
Z:/My File System/Me/Work Files/Design/Two
Z:/My File System/Me/Work Files/Design/Three
Z:/My File System/Me/Work Files/Design/Four
I need:
One
Two
Three
Four
You could implement something like this (assuming you can modify the Folder prototype, and it stores the path as this.path):
Folder.prototype.basename = function () {
    return this.path.split('/').pop();
};
You would then append the base names to the array:
folderArray.push(sFolder.basename());
You could use split() like this, assuming there are no other slashes towards the end.
var sample = 'Z:/My File System/Me/Work Files/Design/Four'.split('/')
var result = sample[sample.length - 1]
Use a regular expression on the path string: start at the end and work backward to the first '/'.
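The string-based suggestions above can be sketched as one small function that works on a plain path string and tolerates a trailing slash (basename here is a hypothetical standalone helper, not a method of the Folder object):

```javascript
// Hypothetical helper: return the last segment of a slash-separated path.
function basename(path) {
  // Remove any trailing slashes so "a/b/" yields "b" rather than "".
  var trimmed = path.replace(/\/+$/, '');
  // lastIndexOf returns -1 when there is no slash, so the whole string is kept.
  return trimmed.substring(trimmed.lastIndexOf('/') + 1);
}

var paths = [
  'Z:/My File System/Me/Work Files/Design/One',
  'Z:/My File System/Me/Work Files/Design/Two',
  'Z:/My File System/Me/Work Files/Design/Three',
  'Z:/My File System/Me/Work Files/Design/Four'
];
var names = [];
for (var i = 0; i < paths.length; i++) {
  names.push(basename(paths[i]));
}
```

If sFolder is an ExtendScript Folder object, its name property may already hold the last segment, which would make a helper unnecessary; that is an assumption about the host environment and worth checking there.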

Can't get Content of the folder using Skydrive API

I have a real problem displaying the contents of folders located inside the root directory. I managed to list the folders in the Files directory, but when I try to do the same for one of those folders, it doesn't work.
I believe the problem is in the path passed to WL.api, but I may be mistaken.
I used code samples from the SkyDrive page of the Live Connect Developer Center. In the sample below I tried to list folders first, but eventually I would like to get the names of all files stored in a particular directory.
WL.api({ path: "me/skydrive/files/myfolder", method: "get" }).then(
function (response) {
var items = response.data;
var outPuts = "";
var number = items.length
var tempos = new Array();
var foundFolder = 0;
for (var i = 0; i < items.length; i++) {
if (items[i].type === "folder" || items[i].type === "album") {
tempos[i] = items[i].name;
foundFolder += 1;
}
}
if (foundFolder == 0) {
folderss.innerHTML = ("Unable to find any folders");
}
else {
for (var i = 0; i < number; i++) {
outPuts = outPuts + tempos[i] + "<br /> <br />"
}
folderss.innerHTML = outPuts;
}
}
);
If I keep only "me/skydrive/files" as the WL path, it works. But if I add a folder name after it, as in my case "me/skydrive/files/myfolder", the call returns nothing. Or maybe I should declare a path like "me/skydrive/files/folder.567391047.34282821!"?
Thanks to anyone who can help.
I believe your problem is due to the fact that you are using an invalid path format. According to the examples from the docs, a valid path to list files has the following form: /OBJECT_ID/files, where OBJECT_ID may be replaced by me/skydrive to reference the Skydrive root folder.
The important things to note are that:
there can be a reference (OBJECT_ID) to only one object;
this reference can only be the ID of an object (as returned by the API) or a special alias such as me/skydrive;
/files should always be the last part of the path (assuming we do not need to use a query string).
Thus, to list the contents of your subfolder folder.567391047.34282821!, you should try using the following path format instead:
/folder.567391047.34282821!/files or even folder.567391047.34282821!/files (without the leading slash, as it seems to be optional).
Please let me know if this solves your issue.
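The path rule described above (one object reference followed by /files) can be sketched as a small helper, together with a way of collecting folder names that avoids gaps in the array. Both functions are hypothetical and do not call WL.api; the sample items are made up:

```javascript
// Hypothetical helper: build "<OBJECT_ID>/files" from a folder ID or an alias.
function filesPathFor(objectId) {
  // Strip leading and trailing slashes so the result has exactly one separator.
  return objectId.replace(/^\/+|\/+$/g, '') + '/files';
}

// Hypothetical helper: collect the names of folder-like items with push().
function folderNames(items) {
  var names = [];
  for (var i = 0; i < items.length; i++) {
    if (items[i].type === 'folder' || items[i].type === 'album') {
      names.push(items[i].name);
    }
  }
  return names;
}

var subfolderPath = filesPathFor('folder.567391047.34282821!');
var rootPath = filesPathFor('me/skydrive');
var sampleItems = [
  { type: 'folder', name: 'Docs' },
  { type: 'file', name: 'notes.txt' },
  { type: 'album', name: 'Holiday' }
];
var foundNames = folderNames(sampleItems);
```

In the question's code, tempos[i] is assigned by index, which leaves holes wherever an item is not a folder, and the output loop runs over every item, so those holes would likely appear as "undefined" in the page. Using push() and looping over the resulting array sidesteps that.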
