How to use 'row elements' of excel file(xlsx) in CasperJS? - javascript

I want to import an Excel file (one column, many rows of strings) into my JavaScript and use those strings to search for text automatically through CasperJS.
How can I import the Excel file and have each of its elements applied in turn?
Here is my code; I want to put the elements of the Excel file where it says "something".
casper.start('http://thehomepage.com/');
// start at homepage
casper.then( function (){
this.sendKeys('#dicQuery','**something**');
// I want to insert my elements here, one per iteration
console.log('entering text');
});
casper.thenClick(x('//*[@id="field"]/a'), function(){
console.log('click searching');
});
casper.then(function() {
words = this.evaluate(getWords);
});
function createFinal(wordArray) {
var out = [];
// remove duplicating START
var a = {};
for(var i=0; i <wordArray.length; i++){
if(typeof a[wordArray[i]] == "undefined")
a[wordArray[i]] = 1;
}
wordArray.length = 0;
for(i in a)
wordArray[wordArray.length] = i;
// remove duplicating END
wordArray.forEach(function(my_word) {
out.push({"moeum": "**something**", "word": my_word});
}); // I want to insert my elements here, one per iteration
return out;
}

I don't think there is an xlsx file reader in PhantomJS (and therefore in CasperJS), but you can save your xlsx file as CSV. Since CSV is a plain text file, you can read it and build the sheet yourself.
For example:
var fs = require("fs");
var sheet = fs.read("data.csv")
.split("\n")
.map(function(row){
return row.split(";"); // or whichever separator character you chose for the CSV
});
Then you can access it like this:
sheet[rowIndex][colIndex]
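To feed each row into the search in turn, one option is to flatten the first column into an array of strings and loop over it. The sketch below is not part of the original answer: `firstColumn` is a hypothetical helper, the sample text stands in for what `fs.read("data.csv")` would return, and it assumes a `;` separator with no quoted fields.

```javascript
// Hypothetical helper: returns the first column of CSV text as an array of
// trimmed, non-empty strings. Assumes no quoted fields.
function firstColumn(csvText, separator) {
  var words = [];
  var rows = csvText.split(/\r?\n/); // tolerate Windows line endings
  for (var i = 0; i < rows.length; i++) {
    var cell = rows[i].split(separator)[0].trim();
    if (cell !== "") { // skip blank rows, e.g. a trailing newline
      words.push(cell);
    }
  }
  return words;
}

// Sample data standing in for fs.read("data.csv").
var sampleCsv = "apple\r\nbanana\r\n\r\ncherry\n";
var words = firstColumn(sampleCsv, ";");
console.log(words);
```

In the CasperJS script, something like `casper.eachThen(words, function (response) { this.sendKeys('#dicQuery', response.data); })` should queue one step per word; as far as I know, `eachThen` hands the current item to the callback as `response.data`. The same value can then be passed on to `createFinal` in place of "something".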

Related

I need a javascript that can extract specific text from a PDF

I work in legal support for litigation. I am not that clued up on scripting, but I have managed to adapt a few Google search results to perform various tasks in Adobe.
What I need is help with what I think should be a simple script to read through a PDF and extract document IDs. They are enclosed in square brackets, so I just need to extract all text between square brackets to a text or CSV file. I have tried using the ChatGPT bot, but that hasn't been very successful. This is the code it gave me:
// Open the PDF file
var filePath = "/path/to/your/file.pdf";
var doc = app.open(filePath);
// Get the number of pages
var numPages = doc.numPages;
// Create an array to hold the results
var results = [];
// Loop through each page and extract text between square brackets
for (var i = 0; i < numPages; i++) {
var page = doc.getPageNthWordQuads(i);
for (var j = 0; j < page.length; j++) {
var word = page[j];
var text = word[4];
// Check if the text is between square brackets
if (text.startsWith("[") && text.endsWith("]")) {
// Remove the brackets and add the text to the results array
results.push(text.slice(1, -1));
}
}
}
// Save the results to a text file
var outputPath = "/path/to/your/output/file.txt";
var outputFile = new File(outputPath);
outputFile.open("w");
outputFile.write(results.join("\n"));
outputFile.close();
// Close the PDF file
doc.close();
I ran the script with my own file path instead of the placeholder, but nothing happened. No error or anything.
I am using a work PC, so I can't install Python or any other program, hence the need for JavaScript, or possibly PowerShell if that will work.
Can anyone help me?
Actually, I realised I could do this using the Evermap plugin: highlight text by pattern, [(.*?)], then extract the highlighted text.
This can be achieved with the pdf.js library. The example below checks for specific text on the first page, but it can be extended to check the whole PDF. Hope this helps!
// Load PDF.js library
const pdfjsLib = require('pdfjs-dist');
// Load PDF file
const url = 'path/to/pdf/file.pdf';
const loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function(pdf) {
// Load the first page
pdf.getPage(1).then(function(page) {
// Get the text content of the page
page.getTextContent().then(function(textContent) {
// Iterate through each text item
for (let i = 0; i < textContent.items.length; i++) {
const item = textContent.items[i];
// Check if the text item matches your criteria
if (item.str.includes('specific text')) {
console.log(item.str);
}
}
});
});
});
You can get rid of the entire for loop and just use a regular expression with String.match():
const data = "This is the text of a pdf file [documentid1] and [documentid2].";
const matches = data.match(/(?<=\[).*?(?=\])/gs);
console.log(matches);
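Building on the regular expression above, here is a minimal sketch that collects the bracketed IDs, drops duplicates, and produces one ID per line, ready to be written to a text or CSV file. `extractDocumentIds` is a hypothetical helper name, the sample text is made up, and the sketch assumes IDs never contain a closing bracket. It uses a capture group instead of lookbehind so that it also runs on engines without lookbehind support.

```javascript
// Hypothetical helper: returns the unique texts found between [ and ],
// in order of first appearance.
function extractDocumentIds(text) {
  var pattern = /\[([^\]]+)\]/g; // capture everything up to the next ]
  var seen = {};
  var ids = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    var id = match[1].trim();
    if (id !== "" && !Object.prototype.hasOwnProperty.call(seen, id)) {
      seen[id] = true;
      ids.push(id);
    }
  }
  return ids;
}

// Made-up sample standing in for the text extracted from the PDF.
var sampleText = "See [DOC-001] and [DOC-002]; [DOC-001] is cited twice.";
var ids = extractDocumentIds(sampleText);
var fileBody = ids.join("\n"); // one ID per line
console.log(fileBody);
```

With pdf.js, the input could be built by joining the `str` values of each page's `textContent.items` before calling the helper.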

Recursive traversal of unknown json object/tree and actions based on returned json

I'm using an API that returns JSON on request. This JSON contains either names for next-level URLs or filenames.
The problem is that the code has to recognize which kind of JSON was returned.
If the JSON only has names for the next URL level, the code should build the URL and fetch it.
It should then recursively get a new set of names or files, recognize which it is, and repeat. It can go as many levels deep as required (1 to *).
If the JSON has a filename, the code should fetch it and render it as HTML.
(This part is already solved.)
Example of json
{id: 'New_url_1_level_1', id:'New_url_2_level_1', id:'New_url_3_level_1'}
//or
{id:'001200.file.ext',id:'001300.file.ext'...}
These would turn into http://my.api.call.com/New_url_1_level_1.../New_url1_level_2/...
The problem is how to loop over the URLs and finally reach a filename, for example:
http://my.api.call.com/New_url_1_level_1/New_url_1_level_2/New_url_1_level_3/001300.file.ext
My current script is:
var json;
var urllevel= '/First_level';
var api = 'http://my.api.call.com';
var re = /^\d+/g; // Regex to match filename (decide if json has filenames or urls; files always start with digits or end with extension)
var loopApiUrl = new Array();
var recursion = false;
// This is the problem - how to recursively build url's based on returned data i.e. traverse a "unknown" tree
function recursePxJson(){
if (!recursion) {
loopApiUrl = [];
}
// Get JSON
$.get(api+urllevel+'/'+loopApiUrl.join('/'),function(data,status){
for (var i in data) {
if (!re.test(data[i].id)) { // {id: 'This_is_to_be_appended_to_url', id:'Another_appendable'}
recursion = true;
loopApiUrl.push(data[i].id);
recursePxJson();
}
else { // {id:'001200.file.ext',id:'001300.file.ext'}
load(api+urllevel+'/'+loopApiUrl.join('/')+'/'+data[i].id);
recursion = false;
}
}
});
//loadDBS(param);
}
// Load renderable JSON - ALREADY SOLVED
function load(param){
$.get(param, function(data, status){
json = JSON.stringify(data);
var title = data.title.split(':');
html = '<h2>'+title[0]+'</h2>';
html += '<h3>'+title[1]+'</h3>';
html += '<h5>Values:</h5>';
for (var i=0; i<data.variables.length; i++) {
html += '<b>'+data.variables[i].text+': </b>';
varlen = data.variables[i].valueTexts.length;
if (varlen > 6) {
html += '<i>'+data.variables[i].valueTexts[0]+', '+data.variables[i].valueTexts[1]+', '+data.variables[i].valueTexts[2]+' . . . '+data.variables[i].valueTexts[varlen-3]+', '+data.variables[i].valueTexts[varlen-2]+', '+data.variables[i].valueTexts[varlen-1]+'</i>'+'<b> (yhteensä '+varlen+' arvoa)</b>';
} else {
html += '<i>'+data.variables[i].valueTexts.join(',')+'</i>';
}
html += '<br/>';
}
$(html+'<br>').appendTo($('#tab2'));
});
}
EDIT: At the moment it seems to finish each for loop before it begins another. So it starts one loop, and if another is instantiated, that one won't run before the first one is done.
Main loop
Internal Loop 1
Internal Loop 2 <- Isn't this the one that should be done first?
Pass your loopApiUrl variable as a parameter to your function recursePxJson().
Get rid of the useless recursion boolean.
You may find it easier to ditch jQuery and use a plain old XMLHttpRequest. Your code will be slightly longer, but you'll gain better control over what you're doing.
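The first two suggestions can be sketched as follows. `walkApi`, `getJson`, and `onFile` are hypothetical names, and the fake API data is invented for illustration. The sketch assumes each response is an array of objects with an `id` property, as `data[i].id` in the question suggests. The request function is passed in, so the traversal itself does not depend on jQuery or XMLHttpRequest. Each call receives its own copy of the path, so asynchronous callbacks cannot corrupt a shared array. The regular expression is used without the `g` flag, because a global regex keeps `lastIndex` state between `test()` calls.

```javascript
var FILE_PATTERN = /^\d+/; // filenames start with digits (no g flag on purpose)

// pathParts: array of URL segments collected so far
// getJson(url, callback): fetches JSON and calls callback(data)
// onFile(url): called once for every file URL found
function walkApi(baseUrl, pathParts, getJson, onFile) {
  var url = baseUrl + "/" + pathParts.join("/");
  getJson(url, function (data) {
    for (var i = 0; i < data.length; i++) {
      var id = data[i].id;
      if (FILE_PATTERN.test(id)) {
        onFile(url + "/" + id);
      } else {
        // concat returns a new array, so sibling branches never share state
        walkApi(baseUrl, pathParts.concat([id]), getJson, onFile);
      }
    }
  });
}

// Invented data standing in for the real API responses.
var fakeApi = {
  "http://my.api.call.com/First_level": [{ id: "A" }, { id: "B" }],
  "http://my.api.call.com/First_level/A": [{ id: "001200.file.ext" }],
  "http://my.api.call.com/First_level/B": [{ id: "C" }],
  "http://my.api.call.com/First_level/B/C": [{ id: "001300.file.ext" }]
};
function fakeGetJson(url, callback) {
  callback(fakeApi[url] || []);
}

var found = [];
walkApi("http://my.api.call.com", ["First_level"], fakeGetJson, function (fileUrl) {
  found.push(fileUrl);
});
console.log(found);
```

With jQuery, `getJson` could be `function (url, callback) { $.get(url, callback); }`, and `onFile` could be the existing `load` function.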

Read variables from a file in Javascript

I have a variable called words in Javascript like this:
var words = [{"text":"This", "url":"http://google.com/"},
{"text":"is", "url":"http://bing.com/"},
{"text":"some", "url":"http://somewhere.com/"},
{"text":"random", "url":"http://random.org/"},
{"text":"text", "url":"http://text.com/"},
{"text":"InCoMobi", "url":"http://incomobi.com/"},
{"text":"Yahoo", "url":"http://yahoo.com/"},
{"text":"Minutify", "url":"http://minutify.com/"}]
I access its elements as, for example, words[0].url, which points to the first URL, i.e. http://google.com/.
If I store the data in a file like this (I call it file.csv):
This, http://google.com/
is, http://bing.com/
some, http://somewhere.com/
random, http://random.org/
text, http://text.com/
InCoMobi, http://incomobi.com/
Yahoo, http://yahoo.com/
Minutify, http://minutify.com/
How can I read the file in JavaScript and re-create the variable words in exactly the same format as above, i.e. re-create:
var words = [{"text":"This", "url":"http://google.com/"},
{"text":"is", "url":"http://bing.com/"},
{"text":"some", "url":"http://somewhere.com/"},
{"text":"random", "url":"http://random.org/"},
{"text":"text", "url":"http://text.com/"},
{"text":"InCoMobi", "url":"http://incomobi.com/"},
{"text":"Yahoo", "url":"http://yahoo.com/"},
{"text":"Minutify", "url":"http://minutify.com/"}]
It looks like there are two steps: first get the external file, then get it into the format you want.
If you're not using jQuery, the first step is:
var file = new XMLHttpRequest();
file.onload = function() {
alert(file.responseText);
}
file.open('GET', 'file.csv');
file.send();
Next step is to take that file.responseText and format it. I might do:
var file = new XMLHttpRequest();
var words = [];
file.onload = function() {
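var lines = file.responseText.split("\n");
for (var i = 0; i < lines.length; i++) {
if (lines[i].trim() === "") continue; // skip blank lines, e.g. after a trailing newline
var word = {};
var attributes = lines[i].split(",");
word.text = attributes[0].trim();
word.url = (attributes[1] || "").trim(); // drop the space that follows the comma
words.push(word);
}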
}
file.open('GET', 'file.csv');
file.send();
If you're using a JSON file, just change the function above to be:
file.onload = function() {
words = JSON.parse(file.responseText);
}
Keep in mind that the words variable will not be available until the onload function runs, so you should probably send it to another function that uses it.
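One way to follow that advice is to separate parsing from loading and hand the result to a callback. In this sketch, `parseWords` and `loadWords` are hypothetical names, and the sample string stands in for the contents of file.csv. It assumes the text field never contains a comma; everything after the first comma is treated as the URL.

```javascript
// Hypothetical helper: turns "text, url" lines into [{text, url}, ...].
function parseWords(csvText) {
  var words = [];
  var lines = csvText.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var comma = lines[i].indexOf(",");
    if (comma === -1) {
      continue; // skip blank or malformed lines
    }
    words.push({
      text: lines[i].slice(0, comma).trim(),
      url: lines[i].slice(comma + 1).trim()
    });
  }
  return words;
}

// Browser-only: loads the file, then passes the parsed array to onReady.
// Defined here for illustration; it is not called in this sketch.
function loadWords(url, onReady) {
  var request = new XMLHttpRequest();
  request.onload = function () {
    onReady(parseWords(request.responseText));
  };
  request.open("GET", url);
  request.send();
}

// Sample standing in for the contents of file.csv.
var sample = "This, http://google.com/\nis, http://bing.com/\n";
var words = parseWords(sample);
console.log(words[0].url);
```

In a page, `loadWords('file.csv', function (words) { /* use words here */ })` keeps every use of `words` inside code that runs only after the file has arrived.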
You could use the fetch API; one of its advantages is a much shorter syntax than XMLHttpRequest.
fetch("object.json").then(function(response){return response.json();}).then(function(data){window.data=data;});
// window.data is only set once both promises have resolved

Return a specific line from a csv file using the data.split function in jquery

I am using the following code to get contents from a csv file
$.get(file , function(data) {
var lines = data.split('\n');
$.each(lines, function (lineNo, line) {
var items = line.split(',');
// MORE STUFF
});
});
The above code gives me all the lines that are available in my csv file. Here is an example of the data returned
one,0,0,
two,0,0
three,0,0
What I would like is to retrieve only a specific line from the file, for example "two,0,0".
How do I achieve this?
Thanks
When you split a string, the result is a zero-indexed array containing each portion of the string, so your lines are numbered like the elements of any other array. If you want the second line, just ask for it by index:
var lines = data.split("\n");
var secondLine = lines[1];
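If the position of the line is not known in advance, the line can instead be looked up by its first field. A minimal sketch, where `findLineByKey` is a hypothetical helper and the sample data is copied from the question:

```javascript
// Hypothetical helper: returns the first line whose first comma-separated
// field equals key, or null if there is no such line.
function findLineByKey(data, key) {
  var lines = data.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].split(",")[0] === key) {
      return lines[i];
    }
  }
  return null;
}

var data = "one,0,0,\ntwo,0,0\nthree,0,0";
var line = findLineByKey(data, "two");
console.log(line);
```

Inside the `$.get` callback from the question, the same call would be `findLineByKey(data, "two")`.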

Javascript ignoring all files as opposed to one

I'm trying to get my JavaScript to ignore one file extension in a folder that holds a bunch of Photoshop images. All of the other file types in the folder populate a window from which the user can import them into their workspace.
I have modified my script to ignore the extension I want ignored, but it no longer populates the window with the other file types contained in the folder. When I remove the file I want ignored from the folder, the window is populated as it should be.
This is what I have at the moment to check my folder for the file types:
//Prompt for folder location
var Path = Folder.selectDialog("Select Folder Location for Renders")
// Use the path to the application and append the samples folder
var samplesFolder = Folder(Path)
//Get the files
var fileList = samplesFolder.getFiles()
//Create array to hold names
var renderTypes = new Array();
//Parse Initial name to get similar render elements
var beautyRender = fileList[0].name
beautyRender = beautyRender.substr(0, beautyRender.length-4)
//Get the render elements with a similar name
for (var i = 0; i < fileList.length; i++)
{
var filename = fileList[i].name;
if (filename.match(/\.(stitchinfo)$/i) == null)
{
if(fileList[i].name.substring(0,beautyRender.length) === beautyRender)
{
renderTypes.push(fileList[i].name);
}
}
}
Can anyone see what I've done wrong and what I need to modify?
Update
I'm still trying to get this to work. Following the help from one of the posters below, I have modified my code to the following:
for (var i = 0; i < fileList.length; i++)
{
var filename = fileList[i].name;
if (filename.match(/\.(stitchinfo)$/i) == null)
{
renderTypes.push(fileList[i].name);
}
}
However, this new code brings a new problem: it returns every file contained in the folders and displays it.
I'm still stumped as to how I can get this to work the way I would like. Can anyone help me, please?
You're creating a sparse array, because you skip elements in renderTypes when you ignore a filename in fileList. That may be confusing your rendering code. Change to:
renderTypes.push(fileList[i].name);
What if :
for (var i = 0; i < fileList.length; i++)
{
var filename = fileList[i].name;
if (filename.match(/\.(stitchinfo)$/i) == null)
{
if(fileList[i].name.substring(0,beautyRender.length) === beautyRender)
{
renderTypes.push(fileList[i].name);
}
}
}
Wrong usage of the array
Missing ";"
Unnecessary use of "continue".
Managed to get a fix for this.
What I've ended up doing is creating a new function and passing that into my call for checking the folder locations.
function ignoreMe(f)
{
return f.name.match(/\.(stitchinfo)$/i) == null;
}
And the folder check:
var fileList = samplesFolder.getFiles(ignoreMe);
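A possible explanation for the original symptom, offered as a guess rather than something confirmed in the thread: `beautyRender` is derived from `fileList[0]`, so if the `.stitchinfo` file happens to be listed first, the prefix comes from the ignored file (and `length-4` strips only part of its longer extension), and nothing else matches. The sketch below picks the prefix from the first file that is not ignored. `collectRenderTypes` and `baseName` are hypothetical helpers, and plain objects with a `name` property stand in for ExtendScript `File` objects.

```javascript
var IGNORED = /\.stitchinfo$/i;

// Strip the final extension, whatever its length.
function baseName(name) {
  var dot = name.lastIndexOf(".");
  return dot > 0 ? name.substring(0, dot) : name;
}

// Returns the names that are not ignored and that share the base name of
// the first non-ignored file.
function collectRenderTypes(fileList) {
  var kept = [];
  var result = [];
  var i;
  for (i = 0; i < fileList.length; i++) {
    if (!IGNORED.test(fileList[i].name)) {
      kept.push(fileList[i].name);
    }
  }
  if (kept.length === 0) {
    return result;
  }
  var prefix = baseName(kept[0]);
  for (i = 0; i < kept.length; i++) {
    if (kept[i].substring(0, prefix.length) === prefix) {
      result.push(kept[i]);
    }
  }
  return result;
}

// Invented file names for illustration.
var files = [
  { name: "shot01.stitchinfo" },
  { name: "shot01.psd" },
  { name: "shot01_diffuse.psd" },
  { name: "other.psd" }
];
var renderTypes = collectRenderTypes(files);
console.log(renderTypes);
```

Because `getFiles(ignoreMe)` already drops the `.stitchinfo` file, `fileList[0]` in the fixed script can no longer be the ignored file, which would be consistent with this explanation.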
