====Writing the Python Script====

The script you use with the [[module_python|Python]] [[analytic_modules|module]] converts an //incoming// JSON file, created from your [[graph|graph]], into an //outgoing// JSON file that is delivered to the user.

This 'transformation' can be //anything// - from a simple calculation to a complete Machine Learning analysis.

===How It Works===

Your script is called by ARDI, which sends the data from your graph through standard input as a JSON file.

As an example, if you had both a **Temperature** and a **Pressure** [[python_output|Python Input]] node, you'd be given a file like the one below...

  {
    "batches": {
      "0": {
        "sharpstart": 0,
        "sharpend": 0,
        "data": [
          { "time": 20993044, "temperature": 22.5, "pressure": 2912 }
          .....
        ]
      }
    }
  }

The data is broken up into **batches**, each of which holds a list of **samples**. Every sample has a timestamp (in UTC Epoch Seconds) and a value for each of your inputs.

You can process this data any way you'd like.

Your **output** should also be a JSON file, which will be merged with the existing analytic query data and returned to the user.

===Inputs===

You can specify which //inputs// your Python script needs using our [[python special comments|special comments]].

You can also optionally choose the **caching method** (covered later).

===Uncached Example Code===

  import sys
  import json
  #These are the input and cache channels.
  #+Customer
  #+Speed
  #+Cache=stops|batch:int
  #Read in the standard input
  content = sys.stdin.read()
  #Convert the input to JSON
  content = json.loads(content)
  items = []
  final = {}
  for batch in content['batches']:
      customer = None
      maxspeed = None
      #Scan every second of data in the batch
      for n in batch['data']:
          if customer is None:
              #This is the first time we've run this.
              customer = n['customer']
              maxspeed = n['speed']
          else:
              if n['customer'] == customer:
                  if n['speed'] > maxspeed:
                      maxspeed = n['speed']
              else:
                  #The customer has changed, so record the previous one.
                  thing = {}
                  thing['customer'] = customer
                  thing['maxspeed'] = maxspeed
                  items.append(thing)
                  customer = n['customer']
                  maxspeed = n['speed']
      #Record the last customer in the batch (skipped if the batch had no data).
      if customer is not None:
          thing = {}
          thing['customer'] = customer
          thing['maxspeed'] = maxspeed
          items.append(thing)
  #Assign the per-customer results to the output
  final['customer_orders'] = items
  #Write the JSON formatted final data
  print(json.dumps(final))

This code goes through every **batch**, and through every **point of data** in that batch. It then returns the maximum speed it found for each consecutive run of samples that belong to the same customer.

===Caching===

If you're not using caching, all of your data will come in as a single 'batch'.

If you would like to take advantage of caching because you're creating **events**, your data might be broken up into two or more batches - these are the //un-cached regions of time that need to be filled in//.

It is important that you...

  * Prevent any data from carrying over from the previous batch to the next batch. You usually do this by clearing any temporary values that persist between loops in your analytic.
  * Add a "Complete" property to each of your events, and only set it to '1' if you're //sure// that you have captured the **entire event** rather than just the start or end.
  * Pay attention to the **sharpstart** and **sharpend** properties of the batch. A 'sharp' start or end indicates that it runs straight into an already-cached time period. If your event is already underway at the start of a batch where //sharpstart// is 1 and it continues all the way through to the end where //sharpend// is also 1, the event should be marked as complete.

//The code below records each 'stop' event (where the speed is < 0.1 m/s), and records the active batch number at that moment.//

===Caching Example Code===

  import sys
  import json
  #These are the input and cache channels.
  #+Batch
  #+Speed
  #+Cache=stops|batch:int
  #Read in the standard input
  content = sys.stdin.read()
  #Convert the input to JSON
  content = json.loads(content)
  stoppages = []
  final = {}
  #Process each batch.
  for batch in content['batches']:
      #Reset the per-batch state so nothing carries over from the previous batch.
      stopstart = None
      first = True
      #Scan every second of data in the batch
      for n in batch['data']:
          if abs(n['speed']) < 0.1:
              #We've stopped. Only open a new event if one isn't already underway.
              if stopstart is None:
                  stopstart = {}
                  stopstart['start'] = n['time']
                  stopstart['complete'] = 0
                  stopstart['batch'] = n['batch']
                  #The first event might be very incomplete.
                  if first and batch['sharpstart'] == 0:
                      stopstart['partial'] = 1
          else:
              if stopstart is not None:
                  #Write out the end of the event.
                  stopstart['end'] = n['time']
                  #An event whose start was cut off is not a complete event.
                  stopstart['complete'] = 0 if 'partial' in stopstart else 1
                  stoppages.append(stopstart)
                  stopstart = None
          first = False
      #If we finished with an incomplete event, make sure it's marked incomplete.
      if stopstart is not None:
          stopstart['end'] = batch['data'][-1]['time']
          stopstart['complete'] = 0
          if batch['sharpend'] == 1 and 'partial' not in stopstart:
              stopstart['complete'] = 1
          #Keep the event that was still open when the batch ended.
          stoppages.append(stopstart)
  #Assign the stoppages to our array
  final['stops'] = stoppages
  #Write the JSON formatted final data
  print(json.dumps(final))
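
To try the caching rules without running ARDI, the stop-detection logic can be wrapped in a function and fed a hand-built input. The following is a minimal sketch, not part of ARDI: ''iter_batches'' and ''find_stops'' are hypothetical names, the sample data is invented for illustration, and the **Batch** input channel is left out to keep it short. The sample file under //How It Works// shows ''batches'' as an object keyed by index, while the example scripts loop over it directly, so ''iter_batches'' accepts either shape; that tolerance is an assumption rather than documented ARDI behaviour. Following the rules in the //Caching// section, an event whose start was cut off is treated as incomplete.

```python
import json

def iter_batches(content):
    """Return the batch dicts as a list, whether 'batches' is a list or an object."""
    batches = content['batches']
    if isinstance(batches, dict):
        # Assumption: keys are batch indices ("0", "1", ...), ordered numerically.
        return [batches[k] for k in sorted(batches, key=int)]
    return list(batches)

def find_stops(content, threshold=0.1):
    """Collect stop events per batch, resetting all state between batches."""
    stops = []
    for batch in iter_batches(content):
        current = None              # per-batch state: nothing carries over
        samples = batch['data']
        for i, n in enumerate(samples):
            stopped = abs(n['speed']) < threshold
            if stopped and current is None:
                current = {'start': n['time'], 'complete': 0}
                # A stop on the first sample of a non-sharp batch may have begun earlier.
                if i == 0 and batch['sharpstart'] == 0:
                    current['partial'] = 1
            elif not stopped and current is not None:
                current['end'] = n['time']
                current['complete'] = 0 if 'partial' in current else 1
                stops.append(current)
                current = None
        if current is not None:
            # The event was still open when the batch ended.
            current['end'] = samples[-1]['time']
            if batch['sharpend'] == 1 and 'partial' not in current:
                current['complete'] = 1
            stops.append(current)
    return stops

# Hypothetical two-batch input, made up for this sketch.
sample = {
    "batches": {
        "0": {"sharpstart": 0, "sharpend": 1, "data": [
            {"time": 100, "speed": 0.0},
            {"time": 101, "speed": 2.0},
            {"time": 102, "speed": 0.0},
        ]},
        "1": {"sharpstart": 1, "sharpend": 0, "data": [
            {"time": 200, "speed": 3.0},
            {"time": 201, "speed": 0.05},
            {"time": 202, "speed": 0.05},
        ]},
    }
}

result = find_stops(sample)
print(json.dumps({"stops": result}))
```

Keeping the logic in a function that takes the parsed input makes it straightforward to check edge cases such as a stop that spans a whole batch; the real script would still read from ''sys.stdin'' and print the final JSON.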