‎‏‎‎‎‎‎‎‎‏‎‎‎‎‏‎‎‎‏‎‎‏‎‎‎‏‏‏‏‏‏‎‏‎‏‎‏‎‎‏‎‎‏‏‎‏‎‎‎‎‎‏‏‏‏‎‏‎‎‏‏‎‏‏‎‎‎‏‏‎‏‏‏‏‎‎‏‎‏‎‏‏‏‎‏‏Navigation‏

Master Datasmith



Introducing xLite

Over the next few weeks I'll publish short posts introducing various aspects and features of xLite.  Use the "Subscribe to posts" link below to be kept informed of new posts (or join me on Twitter).

xLiteWorkbookFunction

posted Feb 11, 2010 3:07 AM by Tom Gleeson   [ updated Feb 14, 2010 4:57 AM ]

xLIteWorkbookFunction(WorkbookName,Address,p1,p2,p3...)

xLiteWorkbookFunction is an extension of the xLite SQLite f() function.  The WorkbookName specifies the workbook to open, the Address specifies the cell to paste a RunID followed by (in cells to the right) the parameters p1 etc.

After that it operates just like the f() function, with xLiteReturn used to return a value to the calling statement.

Having returned, the xLteWorkbookFunction closes the called workbook (without saving).

My original use case for this functionality was to allow me to modularise xLite activities. So, for example, if building a dimensional model I could use a separate workbook for each dimension; isolating the dimension specific ETL logic and allowing for finer control over testing, and when in production, the run sequence for such logic (e.g. time dimension might run once a year, customer dimension code  every week).

The try it out, see the test_call_workbook_function.xls (there's also an .xlsm version) in the latest release ...

User Defined Functions u(),x(),f(), xLitePyScript

posted Dec 8, 2009 7:32 AM by Tom Gleeson   [ updated Feb 11, 2010 3:54 AM ]

u(functionName,p1,p2,p3...)

In previous posts I introduced the SQLite extension function u().This function is itself coded in VBA (See XliteUDFs module) and allows any VBA public function (i.e. a UDF) to be called from within a SQLite SQL statement.  While u() exposes the power of VBA and the Excel Object model, two further functions, x() and f() allow SQLite to take advantage of standard Excel worksheet formulas.

x(formula,p1,p2,p3...)

Where formula is a spreadsheet formula or built-in function, followed by a variable list of parameters. The passed formula can take advantage of xLite "placeholders" e.g. x("20+ :1 + :2",20,40) will return 80, x("upper(:1:)","abc") returns ABC.

The formula's text, with any placeholders replaced, is passed to Excel's Application object's Evaluate method, so not withstanding some of the limitations associated with this function (see here) it's possible to use most Excel formulas within a SQL statement.

Only built-in functions (i.e not User Defined Functions) can be called in this manner.  To call a UDF use the u() function.

When specifying the formula do not include the leading "=", this will ensure formula is evaluated inline along with the rest of the SQL statement. If preceded by "=", the formula is passed back as text. This can sometimes be useful either for testing purposes or to get around Application.Evaluate limitations.

f(address,p1,p2,p3...)

The f() function (aka Worksheet Function) pastes the function's p1,p2,p3.. parametes to the address specified. An additional "Run ID" integer parameter is prepended to the parameter list.  This Run ID can then be referenced by the xLiteReturn() function to pass back a result value to the calling SQL statement.  This allows for the development of "Worksheet Functions", i.e. functions callable by SQL but developed using standard cascading Excel formulas.  Good for testing complex calculation workbooks and for production use where data volumes are low (very VERY slow compared to the u() or x() functions).

Two further methods of developing functions afforded by xLite are xLitePyScript and xLiteWorkbook. I'll cover xLiteWorkbook in a future post.

xLitePyScript(address,p1,p2,p3)

Unlike u(),x() and f() which are SQlite user defined functions (i.e. are called inline on SQL statement like any other SQLite function), xLitePyScript is an Excel UDF and must be wrapped by the u() function when referenced in SQL (this is likely to change in near future, a py() function?).  The address can be either a string representation of an Excel range e.g. "Sheet2!A3:Sheet2!D6" or a named range.  If the range is a single cell the function will expect to find the full Python script within, if a multi-cell range then the script will utilise cells as Python indents.

The scripts are in fact anonymous parameterless Python functions. Parameters are passed via xLite "placeholders" which are replaced before the script is passed to the Python interpreter e.g.

if :2 < 40:
return :1*1.10
else:
return :1

It this case 2nd parameter is less than 40, return 1st parameter increased by 10% otherwise return unchanged.

To see the above in action download the test_worksheet_functions.xls  from here ...

Opening SQLite databases

posted Dec 5, 2009 10:54 AM by Tom Gleeson   [ updated Feb 7, 2010 8:25 AM ]

In the previous example (test SQL.xls) I introduced one method of opening a SQLite database within xLite, the xLiteLoadUnLoad function. This function is both useful & dangerous, it will replace the main (App) database's contents with the database provided by the first parameter. Typically the target database is the default empty in-memory (:memory:) database so no harm done, but if the default database was a file-based database, whose contents you wished to preserve, this might not be what you wanted. The second parameter if set to TRUE (FALSE being the default) works in the opposite direction i.e. replaces the specified database's contents with those of the main (App) database.

So, is there a 'less exciting' method of loading and working with database files. Several, in fact:

  • xLiteAttachDB(databaseName,alias); will attach a database to the App database and assign it an alias. See http://www.sqlite.org/lang_attach.html
  • Setting a custom property of xLiteDB to a database name in a 'driver' workbook, this database will then be used as the default App database. In test_databases.xls, the xLiteDB property is "Nwind.db".  As well as file based databases, the xLiteDB property can be set to ":temp:" or ":memory:", :temp: is like :memory: but will use RAM & a temporary file while :memory: only uses RAM.
  • The similar custom property of xLiteHelperDB controls what Helper (i.e. Logging) database to use.  In test_databases.xls this is set to "log.db".
  • Another option is to use the custom properties xLiteAttachDB & xLiteAttachDBAlias to auto-attach a database at startup.
  • Yet another method is provided by the custom properties xLiteDBSeed and xLiteHelperDBSeed. Does the the equivalent of xLiteLoadUnLoad(databaseName,FALSE) at startup.
There's also an xLiteClose function that is typically called by the 'driver' workbook on a before_close event, but can also be called at the end of script (as in test_databases.xls).  This will release resources held by the App & Helper databases and if they were seeded via xLiteDBSeed or XLiteHelperDBSeed, will save back any changes to the seed databases.

You will notice that the test_databases.xls example runs slower that the test_SQL.xls workbook, this is due to the use of an external log.db database rather than the in-memory default.  The logging functionality commits for each log record written (and being in beta mode there's lots of INFO messages being output). Because of SQLite's ACID nature, such frequent commits result in a slow application.  In fact, a general rule of SQLite; if SQLite appears slow, it's nearly always down to too frequent commits; in other words, use transactions. The logging functionality will be improved to allow for INFO-ERROR-FATAL levels to be configured which will reduce this commit hit in the future e.g. production configuration only FATAL or ERRORs would be recorded.

To see the above in action download test_databases.xls from here ...

Introducing the xLite.SQL function

posted Nov 24, 2009 12:33 PM by Tom Gleeson   [ updated Feb 11, 2010 3:56 AM ]

The xLite.SQL function is to some degree the main xLite function, it is used to issue SQL commands against the two SQLite instances available to xLite powered workbooks. The main database is the APP database, by default an in-memory instance, it's where most of the activity will be centred. The second is a HELPER instance, again by default in-memory, used to hold logs and to enable SQLite to perform certain activities that would not be possible without two separate database instances. How to configure what databases are opened etc. will be explained in a later release, for the purposes of this release, test data will be loaded into the main APP memory database from NWind.db, a SQLite version of the well know MS Access Northwind dataset.

Like (nearly) all xLite functions, xLite.SQL may be used as a UDF (User Defined Function - called in-cell via the = operator like normal Excel formulas) or called from a macro via Application.Run function, or as part of an XLiteScript script.  xLiteScript scripts are tables (sourced from a worksheet or SQLite) that describe a sequence of functions calls designed to perform a sequence-dependent task such as a typical ETL (Extract Transfrom & Load) operation. While it is possible to "chain" a sequence of UDFs & Excel formulas together to simulate a procedural program it can very difficult and the resulting "code" extremely fragile. Instead, the recommended method for automating xLite is to build a macro (Sub()) or an XlIteScript script that can be initiated by either a button or event (or by a single "firing" UDF).  

Avoiding having xLite functions called in a cascade of other xLite UDF calls is particularly important in the case of xLite.SQL as it breaks "the prime directive" of UDFs, the "no side effects" rule. Excel formulas and UDFs are not allowed to affect anything on a workbook except for the return value of the cell hosting the function. If you attempt to affect such an action in VBA code, the action will be ignored (or in extreme cases cause Excel to fail). The second parameter to the xLite.SQL function is a string representation of an Excel Cell address e.g. "Sheet2!B3", if the SQL is a Select statement the resulting dataset is pasted to the area of the workbook pointed to by that parameter. Through the magic of the SQLIte function "u", any VBA function (e.g. Select u('xLiteScript','Script!A1')) that also performs restricted activities can also be supported by xLite.SQL allow-side-affects functionality.

So, how does SQL circumvent this restriction? And should it? 

The xLite.SQL function when called in UDF mode (i.e via = operator in an Excel cell) hands over execution to a temporary instance of the SQLHelper class. This listens for Calculation Events on the host cell. The first such event will be triggered by the function's exit after the creation of the listening object, the created object will then execute the SQL and if a result set is produced and a 2nd parameter was specified, paste that result to the required location. It will also store a return code (OK or ABEND) or the result data itself (if a single cell select, i.e. no 2nd para) in a message-box associated with the original host cell. Finally, the listening object will reassign the same calling formula to the host cell. The effect of re-assigning the formula will be for the cell to call the xLite.SQL function again, but this time it will check the message-box for the result of the statement, destroy the listener object, and return the result to the calling cell. 

All of the above activities are perfectly "legal" even if they go against the "spirit of the law" (but then MS themselves sometimes do so too e.g. Data Tables). 99% of the time xLite.SQL caused side-effects will not be a problem, but there are times when it will fail, most likely in cases where the side-effect result of one SQL call is used to trigger a cascade of other side-effect causing xLite.SQL calls.

There are two guaranteed-no-side-effects SQL calling functions which can be used instead, xLite.xLiteSQL and xLite.xLiteCellSelect (they have the added advantage of being faster and of working in manual-calculation mode). xLite functions such as xLiteScript when used as UDFs are converted to xLite.SQL calls, others' simply refuse to run unless called by a macro or script.

So, you might ask, why bother, why not restrict xLite.SQL to no-side-effect environments like macros and scripts? Because it's so damn useful to use as an UDF. The original use-case for xLite was as a tool to enable large sets of tabular data to be investigated by a combination of the best of SQL and the best of Excel. I use the term investigated rather than analysed or transformed because a lot of a datasmith's job is about looking at datasets to determine their applicability to a given task or to track down problem data within.  Many such tasks are extreme "one-offs" with resulting work often handed over to professional IT or reporting analysts. Being able to use xLite UDFs as freely as one would use normal Excel formulas in such situations (where speed of response is often paramount) is, as the credit card ad puts it, priceless!


‹ Prev    1-4 of 4    Next ›