File metadata is not getting updated in Firebase Storage - javascript

I have created a Cloud Function that triggers on every new file upload to Firebase Storage. After a successful upload the function should update the file's metadata, but the new metadata set with setMetadata() is never applied. There is no error during the process, yet when I check the file afterwards the new metadata is not reflected.
exports.onImageUpload = functions.storage.object().onFinalize(async (object) => {
  const storageRef = admin.storage().bucket(object.bucket);
  var metadata = {
    'uploader': 'unknown'
  };
  await storageRef.file(object.name).setMetadata(metadata).then(function(data) {
    console.log('Success');
    console.log(data);
    return;
  }).catch(function(error) {
    console.log(error);
    return;
  });
  return;
});
There is no error, and the Cloud Function log prints the 'Success' message. The metageneration property is also incremented to '2', which suggests the metadata update went through, yet the new values are nowhere to be seen.

The problem comes from the fact that custom key/value pairs must be nested under the metadata key of the object you pass to the setMetadata() method, i.e. inside the metadata object in your case. This is explained in the API Reference Documentation for Node.js.
So the following will work:
exports.onImageUpload = functions.storage.object().onFinalize(async (object) => {
  const storageRef = admin.storage().bucket(object.bucket);
  var metadata = {
    metadata: {
      'uploader': 'unknown'
    }
  };
  try {
    const setFileMetadataResponse = await storageRef.file(object.name).setMetadata(metadata);
    console.log('Success');
    console.log(setFileMetadataResponse[0]);
    return null;
  } catch (error) {
    console.log(error);
    return null;
  }
});
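If you want to verify the result, you can read the metadata back from the same file. A minimal sketch, assuming the same object from the onFinalize trigger and an already initialized Admin SDK:

// Verification sketch: read the metadata back after setting it.
// Custom key/value pairs live under the nested `metadata` property.
const [fileMetadata] = await admin
  .storage()
  .bucket(object.bucket)
  .file(object.name)
  .getMetadata();

console.log(fileMetadata.metageneration);    // e.g. '2' after the update
console.log(fileMetadata.metadata.uploader); // 'unknown'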

Related

Reading Parquet objects in AWS S3 from node.js

I need to load and interpret Parquet files from an S3 bucket using node.js. I've already tried parquetjs-lite and other npm libraries I could find, but none of them seem to interpret date-time fields correctly. So I'm trying AWS's own SDK instead, in the belief that it should be able to deserialize its own Parquet format correctly -- the objects were originally written from SageMaker.
The way to go about it, apparently, is to use the JS version of
https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html
but the documentation for that is horrifically out of date (it's referring to the 2006 API, https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#selectObjectContent-property). Likewise, the example they show in their blog post doesn't work either (data.Payload is neither a ReadableStream nor iterable).
I've already tried the responses in
Javascript - Read parquet data (with snappy compression) from AWS s3 bucket. Neither of them works: the first uses
node-parquet, which doesn't currently compile, and the second uses parquetjs-lite (which doesn't work, see above).
So my question is, how is SelectObjectContent supposed to work nowadays, i.e., using aws-sdk v3?
import { S3Client, ListBucketsCommand, GetObjectCommand,
  SelectObjectContentCommand } from "@aws-sdk/client-s3";

const REGION = "us-west-2";
const s3Client = new S3Client({ region: REGION });

const params = {
  Bucket: "my-bucket-name",
  Key: "mykey",
  ExpressionType: 'SQL',
  Expression: 'SELECT created_at FROM S3Object',
  InputSerialization: {
    Parquet: {}
  },
  OutputSerialization: {
    CSV: {}
  }
};
const run = async () => {
  try {
    const data = await s3Client.send(new SelectObjectContentCommand(params));
    console.log("Success", data);
    const eventStream = data.Payload;
    // Read events as they are available
    eventStream.on('data', (event) => { // <--- This fails
      if (event.Records) {
        // event.Records.Payload is a buffer containing
        // a single record, partial records, or multiple records
        process.stdout.write(event.Records.Payload.toString());
      } else if (event.Stats) {
        console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
      } else if (event.End) {
        console.log('SelectObjectContent completed');
      }
    });
    // Handle errors encountered during the API call
    eventStream.on('error', (err) => {
      switch (err.name) {
        // Check against specific error codes that need custom handling
      }
    });
    eventStream.on('end', () => {
      // Finished receiving events from S3
    });
  } catch (err) {
    console.log("Error", err);
  }
};
run();
The console.log shows data.Payload as:
Payload: {
  [Symbol(Symbol.asyncIterator)]: [AsyncGeneratorFunction: [Symbol.asyncIterator]]
}
what should I do with that?
I was stuck on this exact same issue for quite some time. It looks like the best option now is to append .promise() to the call.
So far, I've made progress using the following (sorry, it's incomplete, but it should at least enable you to read the data):
// (this snippet sits inside an enclosing async function)
const records = [];
try {
  const s3Data = await s3.selectObjectContent(params3).promise();
  // using 'any' here temporarily, but will need to address type issues
  const events: any = s3Data.Payload;
  for await (const event of events) {
    try {
      if (event?.Records) {
        if (event?.Records?.Payload) {
          const record = decodeURIComponent(event.Records.Payload.toString().replace(/\+|\t/g, ' '));
          records.push(record);
        } else {
          console.log('skipped event, payload: ', event?.Records?.Payload);
        }
      } else if (event.Stats) {
        console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
      } else if (event.End) {
        console.log('SelectObjectContent completed');
      }
    } catch (err) {
      if (err instanceof TypeError) {
        console.log('error in events: ', err);
        throw err;
      }
    }
  }
} catch (err) {
  console.log('error fetching data: ', err);
  throw err;
}
console.log("final records: ", records);
return records;
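Not from the original answer, but for the aws-sdk v3 client shown in the question the logged Payload is an async iterable rather than a Node event emitter, so it can be consumed with for await instead of .on('data'). A minimal sketch under that assumption, reusing the s3Client and params objects defined in the question:

// Sketch: consume the v3 SelectObjectContent event stream with for await.
// Assumes s3Client and params come from the question above.
const consumeSelect = async () => {
  const data = await s3Client.send(new SelectObjectContentCommand(params));
  let csv = '';
  for await (const event of data.Payload) {
    if (event.Records && event.Records.Payload) {
      // Records.Payload is a Uint8Array chunk of the CSV output
      csv += Buffer.from(event.Records.Payload).toString('utf8');
    } else if (event.Stats) {
      console.log(`Processed ${event.Stats.Details.BytesProcessed} bytes`);
    } else if (event.End) {
      console.log('SelectObjectContent completed');
    }
  }
  return csv;
};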

How do I use JavaScript to call the AWS Textract service to upload a local photo for identification (without S3)

I want to call the AWS Textract service from JavaScript (without S3) to identify the numbers in a local photo, and I get an error:
TypeError: Cannot read property 'byteLength' of undefined: Error in Client.send(command)
I tried to find the correct sample in the AWS SDK for JavaScript V3 official documentation but couldn't find it.
I want to know how to modify the code to call this service.
This is my code
const {
  TextractClient,
  AnalyzeDocumentCommand
} = require("@aws-sdk/client-textract");

// Set the AWS region
const REGION = "us-east-2"; // The AWS Region. For example, "us-east-1".

var fs = require("fs");
var res;
var imagedata = fs.readFileSync('./1.png');
res = imagedata.toString('base64');
console.log("res2");
console.log(typeof(res));
// console.log(res)

const client = new TextractClient({ region: REGION });

const params = {
  Document: {
    Bytes: res
  }
};
console.log("params");
console.log(typeof(params));
// console.log(params)

const command = new AnalyzeDocumentCommand(params);
console.log("command");
console.log(typeof(command));

const run = async () => {
  // async/await.
  try {
    const data = await client.send(command);
    console.log(data);
    // process data.
  } catch (error) {
    console.log("Error");
    console.log(error);
    // error handling.
  } finally {
    // finally.
  }
};
run();
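No answer is included here, but two common causes of this byteLength error with SDK v3 are missing credentials (the request signer fails before the call is made) and passing a base64 string where the client expects raw bytes. A minimal sketch under those assumptions (not a confirmed fix from this thread), passing the Buffer returned by fs.readFileSync directly and adding the FeatureTypes parameter that AnalyzeDocument requires:

// Sketch: assumes credentials are available via the environment
// (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY or a shared credentials file).
const { TextractClient, AnalyzeDocumentCommand } = require("@aws-sdk/client-textract");
const fs = require("fs");

const client = new TextractClient({ region: "us-east-2" });

const run = async () => {
  const imageBytes = fs.readFileSync("./1.png"); // Buffer (a Uint8Array), not base64
  const command = new AnalyzeDocumentCommand({
    Document: { Bytes: imageBytes },
    FeatureTypes: ["FORMS"] // AnalyzeDocument requires at least one feature type
  });
  try {
    const data = await client.send(command);
    console.log(JSON.stringify(data.Blocks, null, 2));
  } catch (error) {
    console.error("Textract error:", error);
  }
};
run();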

Making a distinction between file not present and access denied while accessing s3 object via Javascript

I have inherited the following code. It is part of a CI/CD pipeline. It tries to get an object called "changes" from a bucket and does something with it. If it is able to grab the object, it sends a success message back to the pipeline. If it fails to grab the file for whatever reason, it sends a failure message back to CodePipeline.
This "changes" file is made in a previous step of the pipeline. However, sometimes it is valid for this file NOT to exist (i.e. when there IS no change).
Currently, the code makes no distinction between the file simply not existing and the code failing to get it for some other reason (access denied, etc.).
Desired:
I would like to send a success message back to CodePipeline if the file is simply not there.
If there is an access issue, then the current outcome of "failure" would still be valid.
Any help is greatly appreciated. Unfortunately I am not good enough with Javascript to have any ideas to try.
RELEVANT PARTS OF THE CODE
const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const lambda = new AWS.Lambda();
const codePipeline = new AWS.CodePipeline();

// GET THESE FROM ENV Variables
const {
  API_SOURCE_S3_BUCKET: s3Bucket,
  ENV: env
} = process.env;

const jobSuccess = (CodePipeline, params) => {
  return new Promise((resolve, reject) => {
    CodePipeline.putJobSuccessResult(params, (err, data) => {
      if (err) { reject(err); }
      else { resolve(data); }
    });
  });
};

const jobFailure = (CodePipeline, params) => {
  return new Promise((resolve, reject) => {
    CodePipeline.putJobFailureResult(params, (err, data) => {
      if (err) { reject(err); }
      else { resolve(data); }
    });
  });
};

// MAIN CALLER FUNCTION. STARTING POINT
exports.handler = async (event, context, callback) => {
  try {
    // WHAT IS IN changes file in S3
    let changesFile = await getObject(s3, s3Bucket, `lambda/${version}/changes`);
    let changes = changesFile.trim().split("\n");
    console.log("List of Changes");
    console.log(changes);
    let params = { jobId };
    let jobSuccessResponse = await jobSuccess(codePipeline, params);
    context.succeed("Job Success");
  } catch (exception) {
    let message = "Job Failure (General)";
    let failureParams = {
      jobId,
      failureDetails: {
        message: JSON.stringify(message),
        type: "JobFailed",
        externalExecutionId: context.invokeid
      }
    };
    let jobFailureResponse = await jobFailure(codePipeline, failureParams);
    console.log(message, exception);
    context.fail(`${message}: ${exception}`);
  }
};
S3 should return an error code in the exception. The ones you care about are:
AccessDenied - Access Denied
NoSuchKey - The specified key does not exist.
So in your catch block you should be able to check exception.code against these two.
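A minimal sketch of how that catch block in the handler above could look, assuming the v2 SDK error shape (exception.code) and that a missing file should be reported as a success. Note that getObject only returns NoSuchKey when the role also has s3:ListBucket on the bucket; without it, S3 answers AccessDenied even for a missing key.

// Sketch only: drop-in replacement for the catch block in the handler above.
catch (exception) {
  if (exception.code === "NoSuchKey") {
    // The changes file legitimately does not exist: nothing changed, report success.
    await jobSuccess(codePipeline, { jobId });
    context.succeed("Job Success (no changes file)");
    return;
  }
  // Anything else (e.g. AccessDenied) keeps the original failure path.
  let failureParams = {
    jobId,
    failureDetails: {
      message: JSON.stringify("Job Failure (General)"),
      type: "JobFailed",
      externalExecutionId: context.invokeid
    }
  };
  await jobFailure(codePipeline, failureParams);
  context.fail(`Job Failure (General): ${exception}`);
}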

Google cloud function bigquery json insert TypeError: job.promise is not a function

I'm replicating this Google-authored tutorial and I have run into a problem and error that I can't figure out how to resolve.
In the Google Cloud Function that imports JSON into BigQuery, I get the error "TypeError: job.promise is not a function".
It is raised towards the bottom of the function, by this line:
.then(([job]) => job.promise())
The error led me to this discussion about the API used, but I don't understand how to resolve the error.
I tried .then(([ job ]) => waitJobFinish(job)), and removing the line resolves the error but doesn't insert anything.
Tertiary question: I also can't find documentation on how to trigger a test of the function so that I can read my console.logs in the Google Cloud Function console, which would help figure this out. I can test the JSON POST part of this function, but I can't find what JSON triggers a test of a new file write to Cloud Storage - the test says it must include a bucket, but I don't know how to format that JSON (the JSON I use to test the POST -> store to Cloud Storage doesn't work).
Here is the full function, which I've pulled into its own function:
(function () {
  'use strict';

  // Get a reference to the Cloud Storage component
  const storage = require('@google-cloud/storage')();
  // Get a reference to the BigQuery component
  const bigquery = require('@google-cloud/bigquery')();

  function getTable () {
    const dataset = bigquery.dataset("iterableToBigquery");
    return dataset.get({ autoCreate: true })
      .then(([dataset]) => dataset.table("iterableToBigquery").get({ autoCreate: true }));
  }

  // set trigger for new files to google storage bucket
  exports.iterableToBigquery = (event) => {
    const file = event.data;
    if (file.resourceState === 'not_exists') {
      // This was a deletion event, we don't want to process this
      return;
    }
    return Promise.resolve()
      .then(() => {
        if (!file.bucket) {
          throw new Error('Bucket not provided. Make sure you have a "bucket" property in your request');
        } else if (!file.name) {
          throw new Error('Filename not provided. Make sure you have a "name" property in your request');
        }
        return getTable();
      })
      .then(([table]) => {
        const fileObj = storage.bucket(file.bucket).file(file.name);
        console.log(`Starting job for ${file.name}`);
        const metadata = {
          autodetect: true,
          sourceFormat: 'NEWLINE_DELIMITED_JSON'
        };
        return table.import(fileObj, metadata);
      })
      .then(([job]) => job.promise())
      //.then(([ job ]) => waitJobFinish(job))
      .then(() => console.log(`Job complete for ${file.name}`))
      .catch((err) => {
        console.log(`Job failed for ${file.name}`);
        return Promise.reject(err);
      });
  };
}());
So I couldn't figure out how to fix Google's example, but I was able to get this load from JS to work with the following code in a Google Cloud Function:
'use strict';
/* jshint esversion: 6 */

// Get a reference to the Cloud Storage component
const storage = require('@google-cloud/storage')();
// Get a reference to the BigQuery component
const bigquery = require('@google-cloud/bigquery')();

exports.iterableToBigquery = (event) => {
  const file = event.data;
  if (file.resourceState === 'not_exists') {
    // This was a deletion event, we don't want to process this
    return;
  }

  const importmetadata = {
    autodetect: false,
    sourceFormat: 'NEWLINE_DELIMITED_JSON'
  };

  let job;

  // Loads data from a Google Cloud Storage file into the table
  bigquery
    .dataset("analytics")
    .table("iterable")
    .import(storage.bucket(file.bucket).file(file.name), importmetadata)
    .then(results => {
      job = results[0];
      console.log(`Job ${job.id} started.`);
      // Wait for the job to finish
      return job;
    })
    .then(metadata => {
      // Check the job's status for errors
      const errors = metadata.status.errors;
      if (errors && errors.length > 0) {
        throw errors;
      }
    })
    .then(() => {
      console.log(`Job ${job.id} completed.`);
    })
    .catch(err => {
      console.error('ERROR:', err);
    });
};
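For reference, newer releases of @google-cloud/bigquery expose table.load(), which starts the load job and only resolves once it has finished, so the promise chain above can be collapsed. A sketch under that assumption, keeping the dataset/table names and the background-function event shape from the answer:

// Sketch assuming current @google-cloud/bigquery and @google-cloud/storage clients.
const { Storage } = require('@google-cloud/storage');
const { BigQuery } = require('@google-cloud/bigquery');

const storage = new Storage();
const bigquery = new BigQuery();

exports.iterableToBigquery = async (event) => {
  const file = event.data;
  if (file.resourceState === 'not_exists') {
    return; // deletion event, nothing to load
  }

  // load() waits for the load job to complete before resolving
  const [job] = await bigquery
    .dataset('analytics')
    .table('iterable')
    .load(storage.bucket(file.bucket).file(file.name), {
      sourceFormat: 'NEWLINE_DELIMITED_JSON'
    });

  const errors = job.status && job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
  console.log(`Job ${job.id} completed.`);
};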

How do I copy a node(object) in Firebase to another node?

I have a node called Events in Firebase. It consists of child objects with fields like address, description, longitude, and latitude. Before a user deletes an event node I want to copy it within the same database to a node called eventsDeleted.
This is the code for deleting the node:
removeEvent(eventId, groupId) {
  return new Promise((resolve, reject) => {
    this.eventRef.child(groupId).child(eventId).remove();
    resolve();
  });
}
This is the code for creating the node:
addEvent(data: any) {
  console.log('Form data', data.group);
  let localEventRef = firebase.database().ref('events').child(data.group.split(',')[1]);
  let storageRef = firebase.storage().ref();
  let file = data.image;
  let uploadTask = storageRef.child('eventImages/' + UuidGenerator.generateUUID()).put(file);
  uploadTask.on('state_changed', function (snapshot) {
  }, function (error) {
    // Handle unsuccessful uploads
    console.error(error);
  }, function () {
    // Handle successful uploads on complete
    let downloadURL = uploadTask.snapshot.downloadURL;
    let keyOfNewEvent = localEventRef.push(
      new Event(
        null,
        firebase.app().auth().currentUser.uid,
        data.description,
        data.location.address,
        0
      )
    ).key;
    localEventRef.child(keyOfNewEvent).update({eventId: keyOfNewEvent});
  });
}
Never mind the code for uploading an image. I just need a way to copy that entire node, if possible, and then paste it somewhere else in the database. Thanks in advance.
When the user clicks delete, make sure to get the object that's being deleted. If you query your database for that object you can use .once() to retrieve it; otherwise you can jump to the removeEvent function directly.
localEventRef.child(keyOfEvent).once("value", function (snapshot) {
  // once data is returned, you can then call the removeEvent fn
  let eventToBeRemoved = snapshot.val();
  // assuming you have the eventId and groupId
  removeEvent(eventId, groupId, eventToBeRemoved);
});

removeEvent(eventId, groupId, eventObjectToBeRemoved) {
  // firebase 3.x comes with promise-based methods
  firebase.database().ref('eventsDeleted/' + groupId + '/' + eventId)
    .set({...eventObjectToBeRemoved})
    .then(function () {
      eventRef.child(groupId).child(eventId).remove(); // you can now remove
    })
    .catch(function (error) {
      console.error("error occurred trying to add to deletedEvents", error);
    });
}
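An alternative worth noting (not from the original answer): the Realtime Database can perform the copy and the deletion as a single atomic multi-location update, so the event can never end up in both places or in neither. A minimal sketch, assuming the same events/eventsDeleted layout:

// Sketch: move an event atomically with a multi-location update.
// Writing null to a path deletes it; both writes succeed or fail together.
function moveEventToDeleted(groupId, eventId) {
  const db = firebase.database();
  return db.ref('events/' + groupId + '/' + eventId).once('value')
    .then(function (snapshot) {
      const updates = {};
      updates['eventsDeleted/' + groupId + '/' + eventId] = snapshot.val();
      updates['events/' + groupId + '/' + eventId] = null;
      return db.ref().update(updates);
    });
}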
