====Writing the Python Script====
The script you use with the [[module_python|Python]] [[analytic_modules|module]] converts an //incoming// JSON file, created from your [[graph|graph]], into an //outgoing// JSON file that is delivered to the user.
This 'transformation' can be //anything// - from a simple calculation to a complete Machine Learning analysis.
===How It Works===
ARDI calls your script and sends it the data from your graph as JSON on standard input.
As an example, if you had both a **Temperature** and a **Pressure** [[python_output|Python Input]] node, you'd be given a file like the one below...
{
    "batches": {
        "0": {
            "sharpstart": 0,
            "sharpend": 0,
            "data": [
                {
                    "time": 20993044,
                    "temperature": 22.5,
                    "pressure": 2912
                }
                .....
            ]
        }
    }
}
The data is broken up into **batches**. Each batch holds a list of **samples**, and each sample has a timestamp (in UTC epoch seconds) plus a value for each of your inputs.
You can process this data any way you'd like.
Your **output** should also be a JSON file, which will be merged with the existing analytic query data and returned to the user.
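A minimal sketch of that read, process and write cycle, assuming the Temperature and Pressure inputs from the sample above. The helper names ''iter_batches'', ''transform'' and ''main'', and the output key ''average_temperature'', are illustrative assumptions rather than part of ARDI. The helper accepts batches either as a list or as an object keyed by index, since the sample alone does not settle which form is sent.

```python
import json
import sys


def iter_batches(content):
    """Return the batches as a list, whether they arrive as a list or an object."""
    batches = content.get("batches", [])
    if isinstance(batches, dict):
        return list(batches.values())
    return list(batches)


def transform(content):
    """Hypothetical transformation: the average temperature of each batch."""
    averages = []
    for batch in iter_batches(content):
        temps = [s["temperature"] for s in batch.get("data", []) if "temperature" in s]
        # An empty batch has no average, so report None (null in JSON).
        averages.append(sum(temps) / len(temps) if temps else None)
    return {"average_temperature": averages}


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the incoming JSON, transform it, and write the outgoing JSON."""
    content = json.load(stdin)
    json.dump(transform(content), stdout)
```

In a deployed script you would finish with a call to ''main()''. Keeping the transformation in its own function makes it possible to test it with hand-made data before running it through ARDI.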
===Inputs===
You can specify what //inputs// your Python script needs using our [[python special comments|special comments]].
You can also optionally choose the **caching method** (covered later).
===Uncached Example Code===
import sys
import json

#These are the input and cache channels.
#+Customer
#+Speed
#+Cache=stops|batch:int

#Read in the standard input
content = sys.stdin.read()

#Parse the JSON input
content = json.loads(content)

#Batches may arrive as a list or as an object keyed by index; handle both.
batches = content['batches']
if isinstance(batches, dict):
    batches = list(batches.values())

items = []
final = {}

for batch in batches:
    customer = None
    maxspeed = None

    #Scan every second of data in the batch
    for n in batch['data']:
        if customer is None:
            #This is the first sample in the batch.
            customer = n['customer']
            maxspeed = n['speed']
        elif n['customer'] == customer:
            if n['speed'] > maxspeed:
                maxspeed = n['speed']
        else:
            #The customer changed, so record the finished run and start a new one.
            items.append({'customer': customer, 'maxspeed': maxspeed})
            customer = n['customer']
            maxspeed = n['speed']

    #Record the last run in the batch, if the batch held any data.
    if customer is not None:
        items.append({'customer': customer, 'maxspeed': maxspeed})

#Assign the results to our output
final['customer_orders'] = items

#Write the JSON formatted final data
print(json.dumps(final))
This code goes through every **batch**, and through every **point of data** in that batch.
It then returns the maximum speed it found for each consecutive run of samples belonging to the same customer.
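To make the result concrete, here is a compact sketch of the same idea applied to a small hand-made list of samples. It swaps the explicit loop for ''itertools.groupby'', which groups consecutive items that share a key. The function name ''max_speed_per_run'' and the sample values are made up for illustration.

```python
from itertools import groupby


def max_speed_per_run(samples):
    """Return one entry per consecutive run of samples with the same customer."""
    results = []
    for customer, run in groupby(samples, key=lambda s: s["customer"]):
        results.append({"customer": customer, "maxspeed": max(s["speed"] for s in run)})
    return results


# Hypothetical samples from a single batch.
samples = [
    {"time": 0, "customer": "A", "speed": 1.0},
    {"time": 1, "customer": "A", "speed": 3.5},
    {"time": 2, "customer": "B", "speed": 2.0},
    {"time": 3, "customer": "A", "speed": 0.5},
]
orders = max_speed_per_run(samples)
```

Because only consecutive samples are grouped, customer A appears twice in ''orders'': once with a maximum of 3.5 and once with 0.5, with customer B (2.0) in between.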
===Caching===
If you're not using caching, all of your data will come in as a single 'batch'.
If you would like to take advantage of caching because you're creating **events**, your data might be broken up into two or more batches - these are the //un-cached regions of time that need to be filled in//.
It is important that you...
* Prevent any data from carrying over from the previous batch to the next batch. You usually do this by clearing any temporary values that persist between loops in your analytic.
* Add a "Complete" property to each of your events, and only set it to '1' if you're //sure// that you have captured the **entire event** rather than just the start or end.
* Pay attention to the **sharpstart** and **sharpend** properties of the batch. A 'sharp' start or end indicates that the batch runs straight into an already-cached time period. If your event is already under way at the start of a batch where //sharpstart// is 1 and it continues all the way through to the end where //sharpend// is also 1, the event should be marked as complete.
//The code below records each 'stop' event (where the speed is below 0.1 m/s), along with the batch number that was active at that moment.//
===Caching Example Code===
import sys
import json

#These are the input and cache channels.
#+Batch
#+Speed
#+Cache=stops|batch:int

#Read in the standard input
content = sys.stdin.read()

#Parse the JSON input
content = json.loads(content)

#Batches may arrive as a list or as an object keyed by index; handle both.
batches = content['batches']
if isinstance(batches, dict):
    batches = list(batches.values())

stoppages = []
final = {}

#Process each batch.
for batch in batches:
    #Reset per-batch state so nothing carries over from the previous batch.
    stopstart = None
    first = True

    #Scan every second of data in the batch
    for n in batch['data']:
        if abs(n['speed']) < 0.1:
            if stopstart is None:
                #We've just stopped, so open a new event.
                stopstart = {}
                stopstart['start'] = n['time']
                stopstart['complete'] = 0
                stopstart['batch'] = n['batch']

                #An event under way at an un-sharp batch start is missing its true beginning.
                if first and batch['sharpstart'] == 0:
                    stopstart['partial'] = 1
        elif stopstart is not None:
            #Write out the end of the event.
            stopstart['end'] = n['time']
            #Only events captured from their true start are complete.
            stopstart['complete'] = 0 if 'partial' in stopstart else 1
            stoppages.append(stopstart)
            stopstart = None

        first = False

    #If the batch ended mid-event, only a sharp end can make the event complete.
    if stopstart is not None:
        stopstart['end'] = batch['data'][-1]['time']
        stopstart['complete'] = 0
        if batch['sharpend'] == 1 and 'partial' not in stopstart:
            stopstart['complete'] = 1
        stoppages.append(stopstart)

#Assign the stoppages to our output
final['stops'] = stoppages

#Write the JSON formatted final data
print(json.dumps(final))
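The completeness rules from the Caching section can be condensed into a small helper. This is one reading of those rules, not ARDI's own logic: an event that touches an edge of the batch is only trusted when that edge is sharp. The function and parameter names are hypothetical.

```python
def is_complete(touches_start, touches_end, sharpstart, sharpend):
    """Decide the 'complete' flag (1 or 0) for one event within a batch.

    touches_start: the event was already under way at the batch's first sample.
    touches_end:   the event was still under way at the batch's last sample.
    sharpstart / sharpend: the batch's own properties (1 means sharp).
    """
    # Assumption: an un-sharp edge means the rest of the event may lie outside the batch.
    if touches_start and sharpstart != 1:
        return 0
    if touches_end and sharpend != 1:
        return 0
    return 1
```

For example, an event that spans an entire batch with both edges sharp gives ''is_complete(True, True, 1, 1) == 1'', while the same event in a batch with an un-sharp start gives 0.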