We’ve got a lot of things covered during the previous chapters and now it’s time for a dessert. We won’t be learning anything conceptually new in this chapter but I’m sure that it will bring you some fun.
Let’s recall what we’ve done last time. We’ve completed the model of MajiroArc game archives and so far it looks like this (I’ve added names and data to make it slightly more obvious):
And it certainly works. However, we can make it more strict. As you know, MajiroArc has two extra offsets that we don’t use: nameOffset and dataOffset. They point to the beginning of names and data nodes. But what happens if they don’t?
Right now nothing. Still, it’s not a good thing to silently ignore possible errors. Lightpath provides two special nodes: FAIL and WARN which create feedback for the user (or ourselves). Here’s an example:
lpWARN "Something's going the wrong way!" % bad SP thing
Path parameter for FAIL and WARN is the message displayed to the user when such a node matches. Other than this both kinds of nodes are regular contexts – with possible condition, once/order/tentative markers and even children.
If we’re using WARN and user decides to ignore it, modelling will continue – WARN node will be entered if it has children (and modelling will go on like usual even if it’s now inside of WARN), otherwise it will act as if WARN was END – ending parent context. This is the reason I’ve put WARN inside an unnamed context – so if we’re ignoring the message modelling will continue instead of exiting root context and skipping remaining data.
0000h: 4D 61 6A 69 72 6F 41 72 MajiroAr 0008h: 63 56 31 2E 30 30 30 00 cV1.000. 0010h: 04 00 00 00 44 00 00 00 ....D... 0018h: 6C 00 00 00 10 ED D7 01 l....í×.
Two highlighted fields correspond to name offset and data offset. Put anything you like there – e.g. FF FF FF FF – and Remodell the archive (if you’re feeling lazy here’s a sample). You’ll get two warnings like this:
lp.? WARN "File table ended on wrong offset!" ? offset nameOffset$start toint <>
If you do this and ignore the warning you will immediately get another warning saying that modelling has stopped before reaching the end of input stream – which is because WARN acts as END if it has no children.
lp.? WARN "File table ended on wrong offset!" ? offset nameOffset$start toint <> "foo" =
MajiroArc out of the way we can now think about «deployment» (sorry, couldn’t resist using this cool term any programming book must use). We have a rugged script that does excellent job tearing down game archives and showing us all their inner affairs. We can even Save individual files to disk. But it’s not user-friendly. What if we have hundreds or thousands of entries we need to save? Going one-by-one is not an option.
Luckily for us Lexplore has a feature that lets us save nodes in bulk. To get started click on the Extract button to bring up this window (you need to have successfully modelled file for this button to work):
- Data path
- This is what we will be extracting. Remember our earlier talk about node references, or paths? That’s what we use here. Since our data resides under file nodes let’s type in full path to these data nodes: files\file\data$start – don’t forget about $start because if we omit property part $inner will be assumed but our data nodes don’t have it and we’ll get empty files.
- We’ll get back to this in a moment, let’s leave it empty for now.
- Name path
- Similar to Data path, this input must be filled by the path to our name nodes. If we leave it empty or check Unique checkbox then our files will be saved under unique names that are unlikely to tell us much. This path is relative to data node so all we have to type here is: .name. We could also type .name$inner but it’s assumed by default. Below the input we’ll see helpful tips on whether Lexplore could locate this node and if it could we’ll be able to click on the Properties button to bring up that name node’s Node Properties window.
- Once again, let’s leave this empty for now.
- That’s just the location where we want our data to be extracted to.
- File name format
- Sometimes name nodes (specified by Name path) don’t contain full file name – for example, some archives contain files of specific type and omit their extension from file name table. If this is the case this input can be used to give such files an extension. It can also be used to give files arbitrary names with question mark (?) standing for name node value. We don’t have to change this in our case.
- Open target directory
- Fairly obvious, if checked Lexplore will open directory with new files for us.
Japanese visual novels make great examples in many respects. One of them is that they don’t use Latin symbols alone. In our sample archive all files were named like sr013.mjo – but quite often you’ll come over Unicode or Shift-JIS-encoded file names. Here is a sample MajiroArc that uses Shift-JIS encoding to store its file names.
When Lexplore reads our name nodes it treats those strings as having Latin-1, or ASCII, encoding unless they have been converted to a proper Unicode string. ASCII encoding contains basically only Latin symbols, digits and punctuation. It’s not nearly enough to hold all Japanese «alphabet» (if kanji and kana can be called this way). Therefore Japanese used Shift-JIS (that was created before Unicode was/could be widely used) to encode their texts in a special way.
You don’t have to know details of how different character encodings operate; for our problem it’s enough to know that there are different ways to represent the same text and that some of these ways, like ASCII, can’t hold all possible symbols and with them such symbols become garbage.
lpEND? iteration \fileCount$start toint =
Here, RPN expression is what follows the question mark. It turns out that Transform inputs accept exactly the same kind of expression. Lightpath has a bunch of useful functions and among them – charset. It converts string from one encoding (codepage, charset – in most cases all of these terms are interchangible) to another. We need to convert names from Shift-JIS to Unicode and our expression will go like this:
If we input this exression into name node’s Transform box we will get proper file names. Fill in other fields – Data path and name path – and press Extract or hit Enter. You will see that 日本語.mjo has appeared in the target directory – which means our little trick has worked.
Lexplore will autodetect some common data/name node paths and save us from typing them all the time. Before opening this dialog select any data node in the tree and when you press on Extract you will see that data and name paths are already set for us.
There is much more to expressions. With them you can not just extrcat data, you can also decode it in parallel. We will see this in action later in this chapter because version two of MajiroArc applies simple XOR protection for its scenarios. How handy for our learning!
That batch extraction trick is great. However, it still feels a bit «geeky». What if you want other people to be able to extract archvies (or other data) without all the knowledge of Lightpath? Just like other extractors do – open file, select location to save the files, done.
The Lightpath project has a few other useful utilities. One of them is Lexplore’s twin sister – lpx. This is a console tool that offers most of Lexplore functionality via command-line interface. While this sounds even more geeky than using Lexplore give me a moment to explain why it’s so good.
First of all download the tool. If you’re unfamiliar with Windows command line then go to Start → Run (or press Windows key + R) and type cmd.exe or just cmd. Press OK. A black window will open. That’s called command-line window. Here we’ll type one line of a command, press Enter and see how it gets. We won’t be able to type in anything else while that command is operating.
Extract lpx.exe in any directory and put our sample scenario.arc from the last chapter there. Now in the CL window type this:
Replace the path string with path to your directory containing lpx.exe. ALternatively, instead of typing it manually you can drag & drop the folder from Windows Explorer to the CL window. Press Enter once you have the proper line. Command prompt (text before >) should change to reflect your new path.
If your system has multiple drives and your directory is on the one different from where Windows is installed you have to type d: (where d stands for your directory’s drive letter) and press Enter before or after executing the cd command. If you don’t the command prompt won’t change.
Now type lpx and press Enter. It will output a help screen that will begin with a blue text banner with Lightpath 0.79, followed by Usage. You can dive into reading it but right now we need just one command:
shlpx s.lp files path. This is command-line equivalent of Lexplore’s Extract dialog.
Normally, command-line parameters are separated one from another with spaces. If a parameter itself contains spaces we wrap it in quotes. For example, the following command executes program myapp and gives it 3 parameters: first, second spaced, 3:
shmyapp first "second spaced" 3
- This is name of our Lightpath script file: MjArc.lp.
- Name of the data file we’re going to extract: scenario.arc.
- An expression that in Lexplore’s dialog is broken into multiple inputs: data/name path/transformation. As the help text tells us we need to write Data path, then equals sign (=) and then Name path. That is: files\file\data$start=.name.
On Windows, file names are case-insensitive. This means we can run program myapp by typing mYApp, MYapp, MYAPP and other variations. Similarly, if a parameter we’re writing is a name of file – like MjArc.lp – we can write it in any character case as well: mjarc.lp, MJARC.lp, MJarC.lP and so on.
shlpx MjArc.lp scenario.arc files\file\data$start=.name
As you see command line (the part after prompt, >) can wrap onto multiple lines but it doesn’t affect anything. On my system the line was so long that two last symbols – me – are wrapped onto the next line. This is normal and often inevitable if a program accepts many parameters.
You might notice that my screenshot contains «MajiroArc v1 parser» and more lines that might not show up for you. lpx reads script’s signature and preamble and outputs them before extracting files or doing other task. You can add them to your own script and show off to your users when we package it – check out Chapter 1 if you’ve already forgotten what signature and preamble are.
You can repeat this operation for the Shift-JIS archive and find that it works too – with file names being garbage. However, lpx doesn’t suggest any way to enter our transformation expression (that was 'shift-jis' charset). Or does it?
Recall the help screen again and in particular the following command:
shlpx ! s.lp. With its help we can create our own build of lpx with our own model(s) that will be able to extract data files that those models can handle.
We are taking advantage of the scripts being containers of arbitrary and not-so-arbitrary data grouped into sections and subsections. Our models are sections of one kind but there are others: like Options above – which has Ini-like format: set of key/value pairs, keys separated from values with equality signs (=). Spaces around the signs are optional but I’m adding it for better readability. Keys might contain dots to group related options under the same namespace.
The Supported subsection lets lpx detect if this model is capable of processing particular file. As you see it’s subsection of Model – which means it has the same format. We can write as complex Lightpath model there as necessary; lpx will try to apply it to a data file and if no errors occur during modelling it will assume that this model is capable of handling that data file. The only «error» that’s allowed is model returning before reaching end of data.
shlpx ! MjArc.lp
Loading script mjarc.lp... ---------------------------------------- MajiroArc v1 parser ---------------------------------------- This model will process MjArc files and show you file names, offsets, data and w ill even let you extract them! Removing built-in models... Embedding new script: mjarc.lp...
And you’ll also see that new mjarc.exe program has appeared in the same folder. If you run it without parameters you’ll get the same help screen… but wait, there’s one command that looks useful:
shlpx files. It doesn’t take any model parameter, only data file names. What if we run it?
And there is indeed. There’s one little option called lpx.DefaultFiles that lets you specify comma-separated file names (actually, wildcarads) that lpx should look up in the current directory when it’s invoked without parameters. If one or more such files exist they will be extracted without asking any questions or showing the help text.
lp== Options == lpx.DataPath = files\file\data$start lpx.DataExpr = lpx.NamePath = .name lpx.NameExpr = 'shift-jis' charset lpx.DefaultFiles = scenario*.arc
Notice the added last line. Wildcard is a simple string «template» which has two special symbols: * that can stand for zero, one or more characters and ? that can stand for exactly one character. This means that a?c will match abc, a9c, a_c and so on while a*c will match all of these plus ac, a-----c, aaaccc and so on.
In our case, scenario*.arc matches scenario.arc, scenario1.arc, scenario-sjs.arc and everything else. In respect to Majiro game engine this makes sense – it typically names archives like scenario1234.arc, with a numeric suffix.
So let us re-embed new script with this option set. You can either use «clean» lpx or your own mjarc.exe – both have the same features except that mjarc.exe doesn’t contain any default models that lpx might have been shipped with.
Once you have new mjarc.exe try placing it in a folder with one or more archive files. Now run it – without parameters, you can even just double-click on it in Explorer like you run any other program. Done right you will see all scenario*.arc files extracted into their own folders.