OpenToken Package Readme

Version 1.2.1

The OpenToken package is a facility for performing token analysis within the Ada language. It is designed to provide all the functionality of a traditional lexical analyzer generator, such as lex. But due to the magic of inheritance and runtime polymorphism it is implemented entirely in Ada as withed-in code. No precompilation step is required, and no messy tool-generated source code is created.

Additionally, the technique of using classes of recognizers promises to make most token specifications as simple as making an easy to read procedure call. The most error prone part of generating analyzers, the token pattern matching, has been taken from the typical user's hands and placed into reusable classes. Over time I hope to see the addition of enough reusable recognizer classes that very few users will ever need to write a custom one. Ada's type safety features should also make misbehaving analyzers easier to debug. All this will hopefully add up to token analyzers that are much simpler and faster to create, easier to get working properly, and easier to understand.

Manifest

This version of the OpenToken package should come with the following files:

gpl.html
    The license terms for this software. Please read it before using.
gpl.txt
    The plaintext version of the licensing terms
philosophical-gnu-sm.jpg
    A picture that goes with the licensing terms
Readme.html
    This file
Readme.txt
    The plaintext version of this file
token-analyzer.adb
token-analyzer.ads
    The token analyzer class
token-based_integer_ada_style.ads
token-based_integer_ada_style.adb
    Ada integer literal with base designation (e.g. 16#123abc#)
token-based_integer_java_style.ads
token-based_integer_java_style.adb
    Java integer literal with base designation
token-based_real_ada_style.ads
token-based_real_ada_style.adb
    Ada real literal with base designation
token-character_set.ads
token-character_set.adb
    Token recognizer for a string consisting of only characters in a given set
token-csv_field.ads
token-csv_field.adb
    Token recognizer for a field in a comma-separated value file (CSV)
token-end_of_file.adb
token-end_of_file.ads
    Token recognizer for the end of the input
token-extended_digits.ads
token-extended_digits.adb
    Token recognizer for hexadecimal digits. Mostly useful as a building block for other recognizers.
token-graphic_character.ads
token-graphic_character.adb
    Recognizer for a character literal
token-identifier.adb
token-identifier.ads
    Token recognizer for a typical space-delimited identifier
token-integer.adb
token-integer.ads
    Recognizer for an integer literal
token-keyword.adb
token-keyword.ads
    Recognizer for a given specific keyword
token-line_comment.adb
token-line_comment.ads
    Recognizer for a line comment with a specified introducer
token-octal_escape.ads
token-octal_escape.adb
    Recognizer for an octal escape sequence (e.g. \003)
token-real.adb
token-real.ads
    Recognizer for a real (floating or fixed point) literal
token-separator.ads
token-separator.adb
    Recognizer for a non-letter separator. Similar to keyword, but does not worry about the token's case.
token.ads
    Abstract parent class from which new token recognizers may be derived.
Language_Lexers/ada_lexer.ads
    A lexical analyzer for Ada
Language_Lexers/java_lexer.ads
    A lexical analyzer for Java
Examples/ASU_Example_3_6/asu.txt
    A sample input file for the asu_example_3_6 program
Examples/ASU_Example_3_6/asu_example_3_6.adb
    An example program that implements Example 3.6 from the Aho/Sethi/Ullman Compilers text.
Examples/ASU_Example_3_6/Makefile
    A makefile for building the example program from the sources
Examples/ASU_Example_3_6/relop_example_token.adb
Examples/ASU_Example_3_6/relop_example_token.ads
    A token recognizer for a relational operator
Examples/Language_Lexer_Examples/test_ada_lexer.adb
    Testing routine for the Ada lexer
Examples/Language_Lexer_Examples/test_java_lexer.adb
    Testing routine for the Java lexer
Examples/Test/string_test.adb
    Test driver for the string token recognizer
Docs/UsersGuide.html
Docs/UsersGuide.txt
    The OpenToken User's Guide

History

Version 1.2.1

This version adds the CSV field token recognizer that was inadvertently left out of 1.2. This recognizer was designed to match fields in comma-separated value (CSV) files, a somewhat standard file format for databases and spreadsheets. Also, the extraneous CVS directories in the zip version of the distribution were removed.

Version 1.2

The long-awaited string recognizer has been added. It is capable of recognizing both C and Ada-style strings. In addition, there are a great many submissions by Christoph Grein in this release. He contributed mostly complete lexical analyzers for both Java and Ada, along with all the extra token recognizers he needed to accomplish this feat. He didn't need as many extra recognizers as I would have thought. Even so, slightly fewer than half of the recognizers in this release were contributed by Chris (with a broken arm, no less!).

Version 1.1

The main code change in this version is a default text feeder function that has been added to the analyzer.
It reads its input from Ada.Text_IO.Current_Input, so you can change the file to whatever you want fairly easily. The capability to create and use your own feeder function still exists, but it should not be necessary in most cases. If you already have code that does this, it should still compile and work properly.

The other addition is the first version of the OpenToken user's guide. All it contains right now is a user manual walking through the steps needed to make a simple token analyzer. Feedback and/or ideas on this are welcome.

Version 1.0

This is the very first publicly released version. This package is based on work I did while working on the JPATS trainer for FlightSafety International. The germ of this idea came while I was trying to port a fairly ambitious, but fatally buggy, Ada 83 token recognition package written for a previous simulator. But once I was done, I was rather surprised at the flexibility of the final product. Seeing the possible benefit to the community, and to the company through user-submitted enhancement and debugging, I suggested that this code be released as Open Source. They were open-minded enough to agree. Bravo!

Future

As it stands, I am developing and maintaining this package as part of my master's thesis. Thus you can count on a certain amount of progress in the next few months.

Things on my plate for the next release:

* Look into changing the feeder function into a stream reference. I was unfamiliar with streams when I wrote this package. It looks like they would make several things much easier to deal with, but the devil's always in the details...
* The Biggie: A parsing facility in the same vein as this token analysis facility!

Things you can help with:

* More recognizers - The more of these there are, the more useful this facility is. If you make 'em, please send 'em in!
* Well isolated bug reports (or even fixes). Version 1.0 had been fairly thoroughly wrung out already.
But there's a lot of newer code, so it's quite likely that you will find problems.

Again, I hope you find this package useful for your needs.

T.E.D. - dennison@telepath.com