In this article, we will expose a concrete example on how to populate a <table> by using the DataTables jQuery plugin on the client side and data served from a CSV file on CGI mode.
The context
During the implementation of the new FAF-Drugs server, we needed to find a way to display a columns sortable table with up to 50.000 rows. Letting Firefox displays this page did not appear to be the good idea -- except if you want your browser to use 2Gb RAM and wait 2 minutes for each column sorting ... -- Thus, we needed a paginated view of sorted data. Considering that data are stored on the server side, the obvious solution appears to use the DataTables jQuery plugin. The originality of this post relies on fact that data are not stored in a (SQL) database but in a csv file and served on-demand through a python CGI script.
Client side: DataTables integration
This part is inspired from the DataTables example page.
<HTML> part
In your HTML page, you will need to load required javascript plugins and styles in the <header> section:
1 2 3 4 5 6 | <style type="text/css" title="currentStyle">
@import "/media/css/demo_page.css";
@import "/media/css/demo_table.css";
</style>
<script type="text/javascript" src="/media/js/jquery.js"></script>
<script type="text/javascript" src="/media/js/jquery.dataTables.js"></script>
|
Your table which will be populated should look like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | <table class="display" id="results">
<thead>
<tr>
<th>#</th>
<th>ID</th>
<th>MW</th>
...
<th>Status</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="27" class="dataTables_empty">Loading data from server</td>
</tr>
</tbody>
<tfoot>
<tr>
<th>#</th>
<th>ID</th>
<th>MW</th>
...
<th>Status</th>
</tr>
</tfoot>
</table>
|
jQuery part
Now make your #example <table> a dataTable instance include this js code snippet to your page.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | $(document).ready(function() {
var oTable = $('#results').dataTable( {
"bProcessing": true,
"bServerSide": true,
"sAjaxSource": "http://foo.bar/cgi-bin/csv.cgi?url=http://foo.bar/tmp/spam.csv",
"sScrollX": "100%",
"sPaginationType": "full_numbers",
"fnRowCallback": function( nRow, aData, iDisplayIndex, iDisplayIndexFull ) {
switch ( aData[26] ) {
case "Accepted":
$(nRow).addClass("gradeA");
break;
case "Rejected":
$(nRow).addClass("gradeX");
break;
default:
break;
}
return nRow;
}
} );
} );
|
So let's explain datatables' options we used in this case:
- bProcessing: displays a processing indicator while loading data.
- bServerSide: data are processed on the server side.
- sAjaxSource: the url to the script that will receive the request and send json data. Note that datatables will send all required parameters for you to the server, you only need to give the url and additionnal request parameters here.
- sScrollX: as your table is larger than our page, this option makes your table scrollable along the x-axis and the table width is 100% of the allowed size.
- sPaginationType: pagination style, nothing crucial.
- fnRowCallback: we use this callback feature to stylize rows with a particular state (Accepted or Rejected) dicted by the 27th column. If you are not familiar with callbacks, note that this function is called when AJAX request is done.
For the full list of datatable object initialization options, please refer to http://www.datatables.net/usage/
Server side: serve CSV data
When loading data into the table, datatables instance will make AJAJ calls to the script http://foo.bar/cgi-bin/csv.cgi. This script should be able to parse datatables queries and serve data according to DataTables API. In our case, instead of using a database, we had data stored in a csv files (generated for each FAF-Drugs job). We decided to use existing data (up to 50.000 rows), and it appears that Python's CSV module is really surprising by its efficiency: the whole process takes only about 300 ms. The CGI python code is the following:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 | import time
# starting time
st = time.time()
# ------------------------------- import
import csv
import json
import sys
import logging
import operator
import urllib2
import os.path
import string
import re
import cgi
# ------------------------------- vars
PARMS = {
"furl":["url","str"],
"sidx":["iSortCol_0","int"],
"start":["iDisplayStart","int"],
"rows":["iDisplayLength","int"],
"sord":["sSortDir_0","str"],
"echo":["sEcho","int"],
"rexp":["sSearch","str"],
}
# The 27 columns csv file types
CTYPES = (int,str,float,float,float,float,int,int,int,int,int,int,int,int,int,int,int,int,float,int,float,str,str,int,float,int,str)
# ------------------------------- config
LOG_FILENAME = '/tmp/csv2json.log' # The log file
logging.basicConfig( filename=LOG_FILENAME, level=logging.DEBUG )
# ------------------------------- starts
logging.info( ":"*50+" %s" % time.strftime("%c") )
logging.info( "Job started" )
# ------------------------------- core
def cast(values, types=CTYPES):
"""
Cast list data according to predefined types
Note that values and types should have the same length!
@param values: values to cast
@param types : types for values
@return: casted values
"""
return [t(v) for t, v in zip(types, values)]
def format(values, types=CTYPES):
"""
Float values are converted to strings with two decimals.
ADVISORY: only use this function for final data to jsonify since
data type is lost
@param values: values to cast
@param types : types for values
@return: formated values
"""
return ["%.2f"%v if t == float else v for t,v in zip(types, values)]
def search(values, rexp):
"""
Float values are converted to strings with two decimals.
ADVISORY: only use this function for final data to jsonify since
data type is lost
@param values: values to cast
@param types : types for values
@return: formated values
"""
for v in values:
if re.search(rexp,str(v)):
logging.debug( "REGEXP (%s) Match for %s" % (rexp,str(v)) )
return True
return False
def csv2json( furl, sidx, start=1, rows=50, sord="asc", echo=0, rexp=None, pfn=None ):
"""
csv2json
* Parse www csv file
* Sort data according to the input key
* Extract a subset of data
* Encode the subset in JSON
* Print the data
@param furl : csv file url (can be a local path)
@param sidx : column used to sort data
@param start: data chunck/page number (default: 1)
@param rows : how many values do we display (default: 50)
@param sord : sort order, set to desc for reversed (default: True)
@param echo : simple internal iterator used by datatables for rendering
@param pfn : pickle file name -- only used if PICKLE = True (default: None)
@return:None
"""
logging.info( "Input file is: %s"%furl )
# read csv data
t = time.time()
c = csv.reader( urllib2.urlopen(furl), delimiter=';' )
logging.info( "Data loaded in %8.3f s" % (time.time() - t) )
# First raw corresponds to parameters names
k = c.next()
logging.info( "Input csv keys are " + str(k) )
# filter data (search case)
if rexp != "None": # rexp is casted to str
t = time.time()
c = filter(lambda x: search(x,rexp), list(c))
logging.info( "Data filtred in %8.3f s" % (time.time() - t) )
# cast data -- note that we need to sort the whole list for
# sorting consistency
t = time.time()
c = map(lambda x: cast(x,types=CTYPES), list(c))
logging.info( "Data casted in %8.3f s" % (time.time() - t) )
# sort data according to the item rank
t = time.time()
r = True if sord == "desc" else False
s = sorted( c, key=operator.itemgetter( sidx ), reverse=r )
logging.info( "Data sorted in %8.3f s" % (time.time() - t) )
# select data range
t = time.time()
max = len(s)
ff = start if start else 0
tt = ff+rows if ff+rows < max else max
logging.info( "Data range is: %d-%d" % (ff,tt) )
logging.info( "Data selected in %8.3f s" % (time.time() - t) )
# format floats to avoid X decimal
t = time.time()
f = map(lambda x: format(x, types=CTYPES), s[ff:tt])
logging.info( "Floats formated in %8.3f s" % (time.time() - t) )
# format data
j = {
"sEcho":echo,
"iTotalRecords":str(len(s)),
"iTotalDisplayRecords":str(len(s)),
"sColumns":string.join(k,","),
"aaData":f
}
# json data
t = time.time()
# This first line header allows to execute XMLHttpRequest from
# anywhere, it inactivates web browsers CORS (Cross-Origin
# Resource Sharing) -- It has been commented for security
# considerations
# print "Access-Control-Allow-Origin: *"
# print "Content-type: application/json\n"
print json.dumps(j)
logging.info( "Data encoded (json) in %8.3f s" % (time.time() - t) )
logging.debug( "Data encoded (json) content %s" % str(json.dumps(j)) )
# ------------------------------- ends
logging.info( "Elapsed time %8.3f s" % (time.time() - st) )
return
# -- Form --
def formErr( msg, debugMsg = "", debug = False ):
"""
Print an error message (stdout/HTML) and exit
@param msg: the error message
"""
print "Content-type: text/html\n\n"
print "<b>An error occured while processing your request: %s!</b>" % str(msg)
if debug:
print "<p>"
print debugMsg
print "</p>"
sys.exit(1)
def checkForm( theForm, parms ):
"""
Check each form parms and fill the dict ligand
@param theForm: input form
@param parms: input form parameters keys
"""
for aParm in parms:
try:
if theForm.has_key( PARMS[aParm][0] ) and theForm[ PARMS[aParm][0] ].value:
val = str(theForm[ PARMS[aParm][0] ].value)
else:
val = None
logging.debug("checkForm: aParm (%s) val (%s)" % (aParm,val) )
# Set var according to dict keys and cast values
exec "%s=%s('%s')" % ( aParm, PARMS[aParm][1], val)
except:
formErr( "wrong value for parameter %s" % aParm )
return furl,sidx,start,rows,sord,echo,rexp
# -- Main --
if __name__=='__main__':
# Parse the input request
form = cgi.FieldStorage()
# Check submitted parms
furl, sidx, start, rows, sord, echo, rexp = checkForm( form, PARMS.keys() )
# Process
try:
csv2json( furl, sidx, start=start, rows=rows, sord=sord, echo=echo, rexp=rexp )
except:
formErr( "contact an administrator" )
|
I think the code is sufficiently commented to stop boring you here! Do not hesitate to give me feedback about this post.

Comments (0)
Comments are not allowed for this article.