Tuesday, August 14, 2012

pushData

The best way to explain what pushData does is by the words of the maker of original pushData , Andrew Stillman.

Here it is a nice picture explaining it:

pushDataAbout

 

Andrew has been my mentor on all the projects I did this summer and a very good and thorough teacher.

pushData was his idea and out there is a published version, the difference between the previous version and this is basically that here the user gets a graphical interface, this version uses ScriptDb and as suggested by users of the previous version there is a real time transfer option included.

 

How it looks

All parts of the look are not final, but here it is.

After installing the app the onOpen handler will add an additional menu to your spreadsheet:

pushData1

The above entries will do:

Start pushData – will show the user interface where you can define new processes, edit/delete existing processes do an immediate data push (overriding the time interval) and see all processes defined

About pushData - will display the picture first shown in this post

Update database now – will check for changes in your source sheets and update the database accordingly

Clear all data and triggers – will delete whatever you and the script made and give you a fresh start

Package push procedures for others to copy – will generate Google Apps Script code for you that you can share. Say you have a great system of processes and want to share it with someone, this is what you need.

Import copy procedures – will use the code generated by the function above to initialize the processes. NOTE: the spreadsheet names have to bee the same as from the original user

Manage user tracker settings – it is great to allow user tracking, the tracking will not allow anyone to steal the users identity but will help New Vision.

 

Start Screen

pushData2

The grid will provide data on any active processes, Push all data now button will provide an immediate push and New Push Procedure will get the user to the interface for defining a new procedure.

 

The starting point of creating a new procedure is to pick the source spreadsheet, afterwards the app will allow for picking the source sheet. After the source sheet has been picked the app will provide a grid with an overview of the first 5 rows in the source sheet.

pushData3

 

The next step is to chose the headers to be pushed. The app provides a set of list boxes, every box will contain all the headers from the source sheet. The order in which the headers are picked is the order they will appear in the destination sheet.

pushData4

After the headers have been picked the user is prompted to choose the destination spreadsheet and sheet. It will also allow for picking the starting point for the data to be written. A range check will be done, if there is a possibility for overwriting data in the destination sheet the user will get a warning. A grid will provide a overview of data currently in the destination sheet.

pushData5 

Finally the user will pick the push interval, options go from 1 to 24 hours for all processes, but will allow real time pushing for one process.

pushData7

After hitting the finish button the app will save the source sheet to ScriptDb, and preform the first data push taking the user back to the start screen,

pushData7

 

pushData is most definitely the best app I produced this summer of several reasons. On of the main is that Andrew took a very active role in developing this.

Additionally the previous version of pushData has received some feedback which are implemented in the new version.

 

Cheers

Big data sets and ScriptDb (hack)

First of all ScriptDb just may be the best thing that happened to GAS. It allows for fast data retrieval and storage(more or less).

But ScriptDb has its problems, and probably the biggest one is saving big data sets. This was a problem I encountered when working on pushData.

A quick intro:

pushData is supposed to save spreadsheets to ScriptDb and use that data to copy entries from some source spreadsheets to some destination spreadsheets.

To make it more simple, imagine having a huge spreadsheets with student data. The data going from first and last name all the way to place of birth.

But you also need smaller sheets which will have only parts of the data for each student. Say a roster for all the students in the school. pushData will allow you to define a process

for copying data, what data should be copied from where to where and in which order you want the columns to appear. You can decide when the copying should happen. Or rather the time interval.

The time interval can go from 1 to 24 hours. And for one process the time can be set to real time.(as soon as a change has been made to the source sheet it will be reflected to the destination sheet).

 

Now my problem is easily defined, I have 1,2,3..5? huge sheets a few smaller ones and all the data from those sheets has to be saved to ScriptDb. If you have ever tried to save a big sheet to ScriptDb you can see the problem.

  At the time of writing this saving that data is very very difficult, I am confident that GAS engineers will have a solution very soon but until then it is a real quest to do such a thing.

 

My initial approach was to create objects, every row would be represented by an object, the object keys would be the column names and all those object would be neatly saved to the database. (Used something very similar to the function in the documentation). The problem is that saving objects one by one is just very slow, so very slow it exceeds the execution time for any big data set. That I knew and my plan was to use the saveBatch function, however saveBatch will be able to save only small arrays of objects. By small I mean something ~20 objects at a time. I did try slicing and saving only parts but that did not go so well either.

NOTE: “Data Storage error” is the error most frequently shown, but it can (and does) mean very different things.

After a few days trying to solve this and a few periods of thinking: “Ooook it just is not possible” I did find a solution. This solution however will render some ScriptDb possibilities useless. But it still was working and it was acceptable for the script I am building.

 

My Solution (a simplified version)

My solution is a divide and conquer approach. Also I do not save rows any more I save columns.

The data from the sheet is taken by getDataRange().getValues and using 2DArrayLib’s transpose function I shift the rows and columns. 

I go though the array row by row turn the whole row into a JSON string(just temporarily) and find its length. (There is another approach without turning the data to strings but this will work much faster, and I needed fast)

In this process the largest array is found.

 

1 var db=ScriptDb.getMyDb();
2
3 var sheetData=sheet.getDataRange().getValues();
4 var keys=sheetData[0];
5
6 checkDates(sheetData);
7
8 sheetData=ArrayLib.transpose(sheetData);
9
10 var largestSize=0;
11 var indexOfLargest=0;
12
13 for(var i=0; i<sheetData.length; i++){
14 var thisSize=(Utilities.jsonStringify(sheetData[i]).length);
15 if(thisSize>largestSize){
16 indexOfLargest=i;
17 largestSize=thisSize;
18 }
19 }


checkDates(sheetData) is a function that will find columns containing dates and convert the dates to a ScriptDb-friendly format.


Now the "guess” work comes. I assume that all the rows have the same, largest, length. Knowing that there is a limit on the object size ScriptDb will save  I calculate how many objects it will take to save the whole sheet. From that I find the number of elements I can save for every sheet column per object.


1 var elementPerObject;
2 var numberOfObjects;
3 var storePerObject=5*8*1024; //to get size in bytes
4 var additional=0;
5 var type="sheetData";
6
7 numberOfObjects=parseInt(Math.ceil( (largestSize*keys.length+additional) / storePerObject));
8 elementPerObject=parseInt( Math.floor( (sheetData[indexOfLargest].length)/numberOfObjects) );
9

 



This is by no means a surgically correct calculation, far from it. But it will find a SAFE number of elements you can save. It kind of works like sending files over a network when you think of it (mush less sophisticated thou)


All that is left now is to loop from 0 to the number of objects needed and save those parts of the sheet.


1 for(var count=0; count<numberOfObjects; count++){
2
3 var thisItem={};
4 thisItem.headers=keys;
5 thisItem.itemType="sheetData";
6 thisItem.itemId=itemId;
7 thisItem.sheetName=sheet.getName();
8 thisItem.spreadsheetId=spreadsheet.getId();
9 itemId++;
10
11 for(var i=0; i<keys.length; i++){
12 thisItem[keys[i]]=Utilities.jsonStringify(sheetData[i].slice(begin, end));
13 }
14
15 items.push(thisItem);
16
17 begin=end;
18 end+=elementPerObject;
19
20 }

Initially begin and end are set to 0 and elementsPerObject. As you can see it will slice the array and load the parts to the object keys saving pushing all of the objects to an array. Every run of the loop will make a new object.


The number of objects can (and will) get somewhat large, some of the space will get wasted. But it will never cause the script to fail because an object was to large, and that is what I wanted to achieve.


Finally the array of items created has to be saved:


1 begin=0;
2 end=20;
3 if(end>items.length)
4 end=items.length;
5
6 while(end<=items.length){
7
8 var subItems=items.slice(begin, end);
9 var results = db.saveBatch(subItems, false);
10
11 if (!db.allOk(results)) {
12 // partial or no success
13 for (var i = 0 ; i < results.length; i++) {
14 Logger.log(i);
15 var item = results[i];
16 if ((typeof item.success) == 'function' && !item.success()) {
17 db.save(item);
18 }
19 }
20 }
21
22 if(end==items.length)
23 break;
24 else{
25
26 begin=end;
27 end+=18;
28
29 if(end>items.length)
30 end=items.length;
31 }
32
33 }

Again the array is sliced into more digestible pieces and saved, I used the fail-safe from the ScriptDb documentation to make sure all the elements will be saved.


And that is it, the sheet is saved. This function will save a sheet 10 000 rows x 13 columns in 10-14 seconds (tested it for a few days). And even better when using this to write data it finishes in 6-10 seconds.


Giving the possibility of saving several really big sheets Smile.


 


Usually when you save something you want to retrieve it, this is how I do it. The function will depend on the saving data function…


My function queries the database for all the objects associated with a sheet (made in the previous function), sorting the results by item ID. (In my case it was crucial to write the data in the same order it originally was)


Again in this case I am not interested in all the columns from the original sheet( it is not excluded)


 


1 var db=ScriptDb.getMyDb();
2
3 var items=db.query({itemType:"sheetData", spreadsheetId:sourceSpreadsheet.getId(), sheetName:sourceSheet}).sortBy("itemId", db.NUMERIC);
4
5 var completeData=[];
6
7 for(var i=0; i<headers.length; i++){
8
9 completeData[i]=[];
10 completeData[i][0]=headers[i];
11 }

In this case headers is an array filled with column names I want to have on my new Sheet . E.G. the original sheet could have “Name, Surname, ID, Date of Birth” but I only want to have “Name” and “ID” on my new sheet, then headers will contain “Name, ID”.


After setting every first element in the new array to the column name I go through the saved items and join what is inside the row with the values the new object contains.


1 while(items.hasNext()){
2
3
4 var thisItem=items.next();
5
6 for(var i=0; i<headers.length; i++){
7 if(!completeData[i])
8 completeData[i]=[];
9
10 mergeArray(completeData[i], Utilities.jsonParse(thisItem[headers[i]]));
11 }
12
13 }

Finally using 2DArayLib I again transpose the array and write the data to a new sheet. Simple enough, effective and in some cases useless. But for this app and similar ones this approach is gold.


 


Cheers

Long time no read

Hi there,

It has been a while since the last post here but that does not mean that I was slacking of. In the silence period 2 new apps have been written.

A lite weight Reservation app and pushData. Reservation app is meant to be embedded to your site and allow for your visitors to make reservations for any events you are hosting.

pushData is to be very short an app allowing you to copy certain columns from source to destination spreadsheets.  More on pushData in the next post.

 

Reservations

Again reservations is meant to be placed somewhere on your site. It is simple but allows for basic options. Your visitor can make a reservation, edit it or delete it.

Reservations will ask you to provide your calendar ID (the calendar is where you make events available for reserving). You allow for a certain number of available seats and the app will take care of the rest.

(a complete manual on how to set this app is available here: Manual )

The app will look like this:

app1

The settings button will be available only for the site owner.(obvious but I wanted to confirm that).

Now when the app is first installed you will want to go to the settings part and set it up and that will look like this:

First of all it will ask for the company/school name and for the calendar containing the event data.

After that it will ask for required fields, that is what the user needs to provide to make a reservation. There is no filed required by the app but you enter what ever it is you want.(the limit is 40 characters).

Finally it will ask for a confirmation message. That is after the visitor makes a reservation an email will be sent to them with the message you enter here.

A few neat options have been added here, you can use the key word “Name” in the message (if name is one of the fields you ask from the user) and the message will contain the visitors name.

And after hitting finish the setup is done.

 

What the visitor can see

First thing would be making a reservation, so if the user hits the “Make reservation” button it will take him to a new user interface.

app2

The green buttons represent dates with seats left, red buttons are displaying events that are full.  The app will show 5 events per page (if available). It will check for 120 days counting from current date to search for events.

It takes the user through the process step buy step, first being to choose a date from the list. After that has been picked it will show text boxes for information you asked for.

app3

When all the fields contain data the submit button will appear.

app4

The process for the user is finished after clicking submit. The app however will open/make a spreadsheet called reservation for you and fill the data provided by the user.

It will make a sheet for every date with reservation information for the visitors who picked that date and one sheet where you can preview all the reservations made. The app will also edit the calendar event decreasing available seats for the event by 1.

 

Canceling reservations is easy for the visitor, after hitting the “Cancel Reservation” button they will be asked to enter their reservation ID. The ID is randomly generated for every reservation and sent to the user in an email.

All they have to do is copy/paste the ID to a text box hit “Cancel” and the reservation will be cancelled. Along it the data from you spreadsheet will be deleted and the number of available seats will be increased by 1.

app5

 

Editing reservations is almost as easy.

First the event ID has to be provided. (the UI will look as the UI for cancellation). If the ID is recognized the app will provide a new interface.

The new UI will be identical to the interface for making a reservation except the text boxes will be filled with previously provided data.

When a change has been made to a reservation the data in your calendar and spreadsheet will be changed accordingly.

 

That is it about the Reservation app, small, simple and powerful.

 

Cheers