AnyScript and AnyASM Data Types v0.0.1 by Mark Ormston Created: December 23rd, 2006 Last Modified: June 10th, 2007 *************************************************************************************************** Data Types Many compilers and programming languages have ambiguous data types that seem to change from architecture to architecture. For example, in C/C++ languages, an int is 16bit on 16bit platforms, 32bit on 32bit platforms and 32bit OR 64bit on 64bit platforms (it actually acts similar to our Int or ProcInt data type). If you want to use a 16bit data type specifically, you can use short, but if you want a 32bit value, you use long for 16bit, int or long for 32bit, or int for some 64bit systems. Some languages take this a step further and support 16/32bit operations on the same values! Visual Basic 6, for example, uses 32bit signed integers for their "numbers", but if you use bitwise operations to set bit 15 (the sign bit for 16bit values), it sign-extends the resulting 16bit value to the entire 32bit value! To see what I mean, run the following: DIM a AS Integer a = &H0100 a = a OR &H8000 ' ASP/VB "transparently" changes &H8000 to &HFFFF8000 ' Even if you specify &H00008000, it becomes &HFFFF8000 ' To make this work properly, you'd have to use 32768 'a now has the value of &HFFFF8100 or -32512 A goal from the beginning with AnyScript is to alleviate ambiguity where we do not want it, and keep it where it is beneficial. This is done by "exact size" data types for when behavior of a specific data type is designed and "exact size or more" data types when the minimums and maximums are never expected to be reached but we want the processor to be able to use the data optimally. An example of using an "exact size" data type: We are going to use U16 in this example. For integer data types, a U or an S is used for the first letter (U for unsigned, S for signed), followed by the bit depth of the data type (16bits in this case, or 2 bytes). 
A U16 will ALWAYS be 16bits in length, no matter what system the code is being executed on. If you
add one to our U16 and its value is 0xFFFF, it becomes 0; if it was a U32 [32bit integer], it would
have become 0x00010000. In C/C++, the second scenario is EXACTLY what happened to code if you used
an "unsigned int" in 16bit C/C++ then tried to use the same program in 32bit C/C++. While this might
have been desirable, the fact that this acted differently in the different compilers could spell big
trouble for the program when trying to port it from one system to another.

An example of an "exact size or more" data type:
We're going to use U16Up in this example. The word "Up" is appended to any data type to mean, "Use
the specified data type OR HIGHER". The "OR HIGHER" part only occurs if it is more optimal for the
processor to work with data of a larger size. For example, say we want a counter. We know this
counter will never go over 1000. If we used a U16 data type, on some 32bit processors (such as the
x86), there is a timing "hit", or slow down in the CPU to be able to do math or use that data (on
the x86 CPUs, in 32bit mode it costs an extra cycle to use 16bit operands, effectively doubling the
time it takes to do many operations, whereas 8bit and 32bit operations do not have this limitation).
To counteract this, the assembler would "graduate" the value to a U32 so that it could operate at
the optimal size for the processor.

Code that uses one of these ambiguous data types should always stay within the range of the data
type specified. If you use a U16Up, then you should never expect the value to be less than 0 or
greater than 0xFFFF. If you need a larger value, then maybe a U32Up or U64Up is what you are looking
to use.

___________________________________________________________________________________________________
Integer Data Types

All general math operations (bit shifting, bit manipulation, add, subtract, divide, multiply and
comparisons) work on any integer data type.
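For readers coming from C, the difference between "exact size" and "exact size or more" data types
can be sketched with the C99 <stdint.h> types. This is only an analogy and not AnyScript: uint16_t
stands in for U16, uint32_t for U32 and uint_fast16_t for U16Up, and the function names
u16_increment, u32_increment and count_up_to are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* U16 analogue: exactly 16 bits wide, so the result wraps modulo 65536. */
uint16_t u16_increment(uint16_t value)
{
    return (uint16_t)(value + 1u);
}

/* U32 analogue: the same increment carries into bit 16 instead of wrapping. */
uint32_t u32_increment(uint32_t value)
{
    return value + 1u;
}

/* U16Up analogue: uint_fast16_t is at least 16 bits wide, but the compiler may
   pick a wider type if that is faster on the target. The caller promises to
   stay inside the 16 bit range, which the assert documents. */
uint_fast16_t count_up_to(uint_fast16_t limit)
{
    uint_fast16_t counter = 0;

    assert(limit <= 0xFFFFu);
    while (counter < limit) {
        counter++;
    }
    return counter;
}
```

Here u16_increment(0xFFFF) returns 0, while u32_increment(0xFFFF) returns 0x10000. How wide
uint_fast16_t really is depends on the compiler and target, which is the same trade-off a U16Up
makes.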
................................................................................................... Type Casting Between Integer Data Types using the typecast built-in function Generic type casting from one integer type to another always has the following effect: From an Unsigned value: Unsigned Value to Smaller Unsigned Value Upper bits are removed. For example, a U16 with a value of 0x948F, when converted to a U8, becomes 0x8F Unsigned Value to Larger Unsigned Value Value is zero-extended. For example, a U16 with a value of 0x948F, when converted to a U32, becomes 0x0000948F Unsigned Value to Smaller Signed Value Upper bits are removed. The sign is ignored in the conversion. For example, a U16 with a value of 0x948F, when converted to an S8, becomes 0x8F (-113) Unsigned Value to Same Size Signed Value No change. The sign is ignored in the conversion. For example, a U16 with a value of 0x948F (38,031), when converted to an S16, is still 0x948F (-27,505) Unsigned Value to Larger Signed Value The value is zero extended to the proper size. For example, a U16 with a value of 0x948F, when converted to an S32, becomes 0x0000948F From a Signed value: Signed Value to Smaller Signed Value Upper bits are removed. For example, an S16 with a value of 0x942F (-27,601), when converted to an S8, becomes 0x2F (47) Signed Value to Larger Signed Value Value is sign extended. For example, an S16 with a value of 0x942F (-27,601), when converted to an S32, becomes 0xFFFF942F (still -27,601) Signed Value to Smaller Unsigned Value Upper bits are removed. For example, an S16 with a value of 0x948F (-27,505), when converted to a U8, becomes 0x8F (143) Signed Value to Same Size Unsigned Value No change. The sign is ignored in the conversion. For example, an S16 with a value of 0x948F (-27,505), when converted to a U16, remains 0x948F (38,031) Signed Value to Larger Unsigned Value The value is sign extended to the proper size. 
For example, an S16 with a value of 0x948F (-27,505), when converted to a U32, becomes 0xFFFF948F
(4,294,939,791)

...................................................................................................
Static Integer Data Types

These data types are ALWAYS GUARANTEED to be the same size from system to system. They force data to
whatever size they designate.

Name    Bits    Range                                   Description
---------------------------------------------------------------------------------------------------
U8      8       0 to 255                                Unsigned 8bit value
S8      8       -128 to 127                             Signed 8bit value
U16     16      0 to 65,535                             Unsigned 16bit value
S16     16      -32,768 to 32,767                       Signed 16bit value
U24     24*     0 to 16,777,215                         Unsigned 24bit value
S24     24*     -8,388,608 to 8,388,607                 Signed 24bit value
U32     32      0 to 4,294,967,295                      Unsigned 32bit value
S32     32      -2,147,483,648 to 2,147,483,647         Signed 32bit value
U48     48*     0 to 281,474,976,710,655                Unsigned 48bit value
S48     48*     -140,737,488,355,328 to                 Signed 48bit value
                140,737,488,355,327
U64     64      0 to 18,446,744,073,709,551,615         Unsigned 64bit value
S64     64      -9,223,372,036,854,775,808 to           Signed 64bit value
                9,223,372,036,854,775,807
U128    128     0 to over 3.4e+38                       Unsigned 128bit value
S128    128     below -1.7e+38 to over 1.7e+38          Signed 128bit value
U256    256     0 to over 1.1e+77                       Unsigned 256bit value
S256    256     below -5.7e+76 to over 5.7e+76          Signed 256bit value

* This data type is not natively supported by any processor that we are aware of and is generally
  slower than using a higher-end data type. It is provided for ease-of-use in the rare cases where
  such a data type might be useful but it is strongly recommended that they be avoided as much as
  possible.

...................................................................................................
Dynamic Integer Data Types

All of the static integer data types have a dynamic integer data type when the word 'Up' is appended
to their name, such as U32Up, S16Up, etc.
These values mean that the data being used in those variables is bounded by the range of their
specific data type, but the compiler may choose to increase their size for faster processing. In
general, an xxxUp data type will be the larger of its specific type or xProcInt. Note that unusual
data types, such as U24, will always upgrade if specified as a U24Up unless there really is a
processor out there somewhere that can handle three bytes of data at a time better than four.

Examples:
These examples are showing only the unsigned versions, though the same rules apply to the signed
versions.

On a computer that has 16bit processor integers, U8Up, U16, U16Up and UProcInt are all identical.
U32Up values would be the same as U32 values, U64Up values would be the same as U64 values, etc.

On a computer that has 32bit processor integers, U8Up, U16Up, U32, U32Up and UProcInt are all
identical. U64Up values would be the same as U64 values, U128Up values would be the same as U128
values, etc.

On a computer that has 64bit processor integers, U8Up, U16Up, U32Up, U64, U64Up and UProcInt are all
identical. U128Up values would still be the same as U128 values.

Name        Bits    Expected Range          Description
---------------------------------------------------------------------------------------------------
Char        8       0 - 255                 A single character. Unlike other languages, this is
                                            always an unsigned 8bit value.
PointerInt  16+     Unknown                 Size of a pointer on the processor
UProcInt    16+     0 to 65,535             Unsigned 16bit or higher processor-size integer
ProcInt     16+     -32,768 to 32,767       Alias for SProcInt
SProcInt    16+     -32,768 to 32,767       Signed 16bit or higher processor-size integer
UInt        16+     0 to 65,535             Alias for UProcInt
Int         16+     -32,768 to 32,767       Alias for SProcInt
SInt        16+     -32,768 to 32,767       Alias for SProcInt

PointerInt is a special integer data type meant to be interchangeable with pointers. They are useful
for converting pointers to integers and back again.

ProcInt, SProcInt and UProcInt are special data types.
They are geared towards matching the most efficient integer size for the target processor, with a
minimum of 16bits.

___________________________________________________________________________________________________
Floating Point Data Types

AnyScript does not currently support floating point. It will support IEEE style floating point,
32bit and 64bit values. This would require floating point emulation libraries on certain systems
that do not have a math coprocessor.

The following formats are planned to be supported:

Name    Bits    Range                                   Description
---------------------------------------------------------------------------------------------------
F32     32
F64     64
F128    128
F256    256

___________________________________________________________________________________________________
Special data types

Name            Description
---------------------------------------------------------------------------------------------------
Boolean         A true or false value
                A boolean is actually stored as a UProcInt, but when used, it can only have two
                values: true or false. These are stored in the value as all bits set (true) or all
                bits clear (false). Using a boolean in a math equation converts it to 1 if true, 0
                if false. Likewise, converting a boolean to any integer data type follows the same
                rule, 1 if true and 0 if false.
                When assigning a value to a boolean, or type casting, the boolean becomes true if
                the value is non-zero, or false if it's zero.
Nothing         Nothing, Not a Value, Void
                This is the AnyASM/AnyScript equivalent of void from C/C++.

___________________________________________________________________________________________________
Special "class-like" Data Types

Something that I believe is sorely missing in some languages, such as C/C++, is support for
internally controlled class-like data types. This is why outside libraries are necessary in C/C++ to
have smart string operations, such as are available in many scripting languages, as well as
associative arrays or variant data types.
In AnyScript, this is avoided by having some fundamental "class-like" data types built into the
language. These are automatically initialized and deinitialized by the language itself, avoiding the
need for programmers to have to deal with it themselves. It also makes syntax simpler and makes
possible some special cases that C/C++ does not allow, such as switch statements on strings.

For example, if we create a string data type in C++ and have the + operator configured to add values
to the string, then we might think the following may be useful:

    U32 c;
    string s;
    s = "The value of c is: " + c;

However, in C++, this would actually add the value of c to the POINTER for the const char * that
contains the string, "The value of c is: ", which makes no sense programmatically! However, the same
sort of thing is possible in scripting languages, either as a special character:

    s = "The value of c is: " & c       ' Visual Basic 6
    $s = "The value of c is: " . $c;    // PHP

or as a normal plus (+):

    s = 'The value of c is: ' + c;      // JavaScript

For each "Class-like" data type, see the full description below

Name            Description
---------------------------------------------------------------------------------------------------
Function        This object holds a pointer to a function, as well as storing all of the data
                necessary to represent the function.
                * KERBLUH - Needs work
String          A string, designed for maximum compatibility with other programming languages. A
                length is assigned to the string. The string's data array is NULL terminated so it
                may be used without modification in C/C++ functions.
                Examples:
                    String s;
                    s = "This is only a";   // Allocates the memory necessary then copies the string
                    s += " test!";          // Then appends " test!" to the end of it
                    SomeCFunc(s.Data);      // Uses the NULL terminated data buffer in s to send
                                            // to SomeCFunc
StringMem       This is a memory length prefixed string. It is similar to how strings are stored in
                various BASIC programming languages.
                Usually it is only used for data transfers to/from files, networks, etc. StringMem
                values are not designed to be written to; they are generally read-only, since they
                do not store how much memory is allocated to them. However, a String may be
                translated to a StringMem for data storage or transfer purposes.
                When a StringMem constant is declared, it will try to fit the data length into the
                smallest data type possible. If the length of the string is 0 - 255, a StringMem8
                is used, 256 - 65535 for a StringMem16 and 65536+ characters for a StringMem32.
                These constants DO NOT store what type they are; the compiler will handle the
                actual data storage format. The type can be found using the GetType language
                construct, such as:
                    #If GetType(MyStringMem) == StringMem8
                        // code for StringMem8
                    #ElseIf GetType(MyStringMem) == StringMem16
                        // code for StringMem16
                    #Else
                        // code for StringMem32
                    #EndIf
                Alternatively, it can be declared specifically as a StringMem8, StringMem16 or
                StringMem32.
StringMem8      Same as a StringMem, but with an 8bit length.
StringMem16     Same as a StringMem, but with a 16bit length.
StringMem32     Same as a StringMem, but with a specifically 32bit length.
StringNull      This is a null-terminated string. It is exactly the same as how C/C++ programming
                languages store strings. The string data stored in a String data type is a
                StringNull for compatibility with C/C++ and other functions.
Variant         An object that may contain any sort of data. A "data type" value is stored in the
                variant so the system knows how to handle the data.
                Note that while Objects contain Variants for their keys, a Variant may itself be an
                Object. This permits multi-tiered Objects.
Examples: UProcInt q; String s; Variant v; // After initialization, v is the NULL data type q = 12; s = "Test"; v = q; // v becomes a UProcInt with a value of 12 v += q; // v is now 24 v = s; // v becomes a string with a value of "Test" v += s; // v is now "TestTest" v += q; // v is now "TestTest12" v = "x"; // v is now "x" v *= q; // v is now "xxxxxxxxxxxx" VariantObject A generic object data type, meaning that multiple named variant values may be assigned to it. This may be treated as an object type, or as an associative array. This data type may be used in two ways (this is similar to Javascript): varobject.key = "Something"; // Assigns a string to key of name "key" varobject["key"] = "Something"; // Identical to the above For example, in the following code, we create an object and assign 2 to a keyed value of "value": VariantObject o; o.value = 2; // Creates a key of "value", type "variant", set to a UProcInt with // a value of 2 o["value"] = 2;// An alternate form of the above that works just as well o["value"] = New VariantObject; // "value" is now a Variant Object o["value"]["extra"] = 2; // Now "extra" in the o["value"] object is a UProcInt // with a value of 2 o["value"] += 2;// Run time error, cannot add a UProcInt to a Variant Object! --------------------------------------------------------------------------------------------------- String Data Type NOTE: Currently there is only one string data type supported using ASCII (8bit). Unicode, UTF-8, etc. support will be added later. The String data type is designed for ease of use, as well as portability to other languages. 
The following operations exist for Strings: String s; s = Null; // Same as s = ""; s = 1; // Since s is declared as a String and not a Variant, it does not change type // In this case, the 1 is type cast to a string and that is put into s // s becomes "1" s = "123"; // "123" is a string in memory and this copies the data to s s += s; // s at this point would be "123123" s *= 2; // s at this point would be "123123123123" s -= "12"; // Looks for the first instance of "12" and removes it // s becomes "3123123123" s = "123" + 7; // s becomes "1237" s[2] = 'A'; // s becomes "12A7" s = "123" * 3; // s becomes "123123123" If (s != "123") { // Tests if s is not "123" and works as a scripter would expect, not a C/C++ // programmer Switch (s) { // Yes, it works! Case "123123123": // This is where we'd go s = "xyz"; Break; } } If (s) // Tests if s is not "" s = "Not Null"; If (!s) // Tests if s IS "" (testing if it is Null is the same) s = "Null"; If (s == 123) { // Converts s to a UProcInt, then compares it to 123 s = "NeverGetHere"; } If (s === 123) { // Triple equal means to test the data type. This is always false no matter what // is in s since s is a string and 123 is a numerical type s = "NeverGetHereEither"; } If (s.Length == 2) {// Test if the string has a length of 2 s = "Length of 2."; } Properties: ................................................................. U32Up Length Current length of the string U32Up Allocated Stores the number of bytes allocated to this string This is always Length + 1 or more StringNull *Data The data of the string Methods: .................................................................... N/A --------------------------------------------------------------------------------------------------- StringMem Data type This data type is a region in memory with a number for the data length, followed by that number of characters. This is similar to how some BASIC programming languages store strings. 
For example, the word "Hello" with a 32bit length on a little endian 32bit system would be stored
as:

    05h, 00h, 00h, 00h, 'H', 'e', 'l', 'l', 'o'

The four numbers at the beginning are the 32bit equivalent of the number 5, the length of the
string.

There are special types that are practically identical to StringMems which are StringMem8,
StringMem16 and StringMem32. These specifically use U8, U16 and U32 for lengths. These are designed
for compatibility but also to use less storage space. Using our "Hello" example above, the following
would be stored:

    As StringMem8:      05h, 'H', 'e', 'l', 'l', 'o'
    As StringMem16:     05h, 00h, 'H', 'e', 'l', 'l', 'o'
    As StringMem32:     05h, 00h, 00h, 00h, 'H', 'e', 'l', 'l', 'o'

---------------------------------------------------------------------------------------------------
StringNull

This is a region in memory that holds a NULL (0) terminated string. This data format is designed for
compatibility with existing C/C++ style data. In general, it should never be used since the final
AnyASM source will handle any system or language discrepancies.

Again using "Hello" as an example, this would be stored in memory as:

    'H', 'e', 'l', 'l', 'o', 00h

---------------------------------------------------------------------------------------------------
Variant Data Type

Properties: .................................................................
    UProcInt    DataType        Holds the current type of data this variant is:
                                BITS 0-7 - 00FF - Data Type:
                                    00 - U8
                                    01 - S8
                                    02 - U16
                                    03 - S16
                                    04 - U32
                                    05 - S32
                                    06 - U64
                                    07 - S64
                                    08 - U128
                                    09 - S128
                                    0A - U256
                                    0B - S256
                                    20 - U24
                                    21 - S24
                                    22 - U48
                                    23 - S48
                                    80 - String
                                    90 - VariantObject
                                    91 - Variant
                                    C0 - Boolean
                                    E0 - Class (???)
    UProcInt    PointerDepth    Holds the number of pointers for this value.
                                For example, a U32 **var; would have a PointerDepth of 2
                                KERBLUH - I really don't like this and would love suggestions for
                                a better implementation
    UProcInt    DimensionNum    Holds the number of array dimensions for this value, if any.
    U8 *        Data            Holds the data of the variant
    U32Up       Allocated       Stores the number of bytes allocated to this variant

Methods: ....................................................................
    N/A

---------------------------------------------------------------------------------------------------
VariantObject Data Type

Properties: .................................................................

Methods: ....................................................................
    N/A

***************************************************************************************************
Advanced Variable Access
___________________________________________________________________________________________________
Variable Overloading

This allows the space occupied by a variable to be accessed any way desired, similar to the
flexibility of assembly language but available in high-level AnyScript. This is done through
pseudo-arrays or lists of data that the variable occupies.

Each numerical variable has one or more values of lesser and equal data types, depending on what
data type it mirrors. These are treated as pseudo-properties with the same name as the data type and
an array length based on how many can fit into the value. If no array value is used, then the first
element is used automatically.

For example, let's take a look at a U32 called x. This is a 32bit (4 byte) region in memory. With
these 32bits, we can have: 4 U8s, 4 S8s, 2 U16s, 2 S16s, 1 U24, 1 S24, 1 U32, 1 S32, or 1 F32.

So, to access it as four U8s, we can use x.U8[0] (always accesses the low byte, regardless of
endianness), x.U8[1], x.U8[2] and x.U8[3] (always accesses the high byte, regardless of endianness)
to directly access the 4 U8s. We can use x.U8 to access x.U8[0] since the first element is used when
no array element is specified. For U24, S24, U32, S32 and F32 with x, we can use either [0]
(x.F32[0]) or just the name of the data type (x.F32).
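To make the "regardless of endianness" rule concrete, here is a rough C model of the pseudo-arrays
on a U32. It is a sketch under assumptions, not how AnyScript or AnyASM implements them: the helper
names u32_get_u8, u32_get_u16 and u32_set_u8 are hypothetical, and shifts are used precisely because
they work on the value and never on the byte layout in memory. The asserts stand in for the error
an out of range index would be expected to raise.

```c
#include <assert.h>
#include <stdint.h>

/* Models x.U8[index]: index 0 is always the low byte, index 3 the high byte. */
uint8_t u32_get_u8(uint32_t x, unsigned index)
{
    assert(index < 4);              /* only four U8s fit into a U32 */
    return (uint8_t)(x >> (8u * index));
}

/* Models x.U16[index]: index 0 is always the low word. */
uint16_t u32_get_u16(uint32_t x, unsigned index)
{
    assert(index < 2);              /* only two U16s fit into a U32 */
    return (uint16_t)(x >> (16u * index));
}

/* Models x.U8[index] = value; returns the updated 32 bit value. */
uint32_t u32_set_u8(uint32_t x, unsigned index, uint8_t value)
{
    unsigned shift = 8u * index;

    assert(index < 4);
    /* Clear the selected byte, then merge in the new one. */
    return (x & ~((uint32_t)0xFF << shift)) | ((uint32_t)value << shift);
}
```

With x = 0x0012D687, u32_get_u8(x, 0) gives 0x87, u32_get_u8(x, 2) gives 0x12 and u32_get_u16(x, 1)
gives 0x0012, on little endian and big endian machines alike.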
The following code is legal without a type cast: U8 somevar; U32 x; x = 1234567; // Since 1234567 is 0x0012D687, somevar would become 0x12 or 18 somevar = x.U8[2]; // To do the same thing in C/C++ style programming, you would have to do: somevar = ((U8 *)&x)[2]; // EXCEPT if this was on a big endian system, then you'd have to do: somevar = ((U8 *)&x)[1]; The order of the data types in memory is the same as is used on a Little Endian system, because it's the order that makes the most sense. Back to our x example, if we were to show the same blocks of memory in a list, it'd look like: MSB ..................... LSB | U8[3] | U8[2] | U8[1] | U8[0] | | S8[3] | S8[2] | S8[1] | S8[0] | | U16[1] | U16[0] | | S16[1] | S16[0] | |-------| U24[0] | |-------| S24[0] | | U32[0] | | S32[0] | | F32[0] | Using this model, we can see that: x.U8[0] = red; x.U8[1] = green; x.U8[2] = blue; is the same as: x = red + (green << 8) + (blue << 16); except it's usually faster because it's only three reads and three writes instead of three reads, two bit shifts, two adds and a write. Also, to further prove that this order is useful, this can be used for down-type casting purposes (in fact, it's the preferred method): U8 somevar; U32 x; x = 1234567; // Since 1234567 is 0x0012D687, somevar would become 0x87 or 135 somevar = x.U8; // Grabs x.U8[0], which does the same thing as type casting x to a U8. On Big Endian systems, such as the MC68K, the memory values are stored BACKWARDS. Again with x, x.U8[0] = 1; stores the 1 in the 4th byte so that the full value of the memory union is correct (1) and so the same code will work identically on both Big Endian and Little Endian systems. Physical byte order in memory CAN NOT be accessed this way; in fact, it should never be important or necessary at all with the design of AnyScript and AnyASM to make it so that your software does not need to be dependent on that. 
There still may be cases where it is absolutely necessary to access memory on a byte by byte basis
and this will be worked out in a later version of Any.

Note that the only valid types to use are static integral data types where the size is always given.
Using a non-integral type will result in an error. The only exception is when you are overloading a
variable that is not a static size to another type that is the same size.

    ProcInt somevar;
    U32 x;
    String somestring;

    x = somevar.UProcInt;           // OK: somevar is a ProcInt, so overloading it to UProcInt
                                    // doesn't cause problems
    somevar = x.UProcInt;           // ERROR: Non-static type used.
                                    // x only has 32bit allocated to it, and a UProcInt might be
                                    // more than 32bit in size
    somestring = somevar.String;    // ERROR: Object type used

If you attempt to access an invalid size or an out of bounds index, an error will occur:

    ProcInt somevar;
    U32 x;

    x = somevar.U32;                // ERROR: ProcInts are assumed to be 16bit!
    x = somevar.U8[2];              // ERROR: There are only two 8bit indices for 16bit values
                                    // (including ProcInt)

These may be used cumulatively.
NOTE: Using cumulative access is STRONGLY discouraged. You can easily access misaligned data this
way, which causes slowdowns on many processors.

    ProcInt somevar;
    U32 x;

    somevar = x.U8[1].U16;          // Accesses a U16 based on the second lowest byte.

For arrays, you must first pick which element you are accessing.
    U32 x;
    ProcInt somearray[10][20];
    U32 anotherarray[10];

    x = somearray[5][4].U8;     // Gets the low byte of the specified array item
    x = anotherarray.U8[20];    // ERROR: Though there are more than 20 bytes in the array, this is
                                // not the correct method to read from the array as a byte array
                                // This would try to access the 21st byte of the first element in
                                // the array which is impossible since it only has 4 bytes
    x = (anotherarray.U8)[20];  // Ok: Accesses the 21st byte in anotherarray
    x = (anotherarray.U8)[90];  // ERROR: Only 10 * 4 (32bit) = 40 bytes are allocated for
                                // anotherarray, so this will cause an out-of-bounds error
    x = (somearray.U8)[20];     // ERROR: somearray is a ProcInt so we do not know the size of each
                                // element ahead of time and this would work differently from one
                                // computer to the next! Use the method above, where you specify
                                // the element you want to access first.

Note that for array types, you can overload the data types upwards:

    U8 somebuffer[100];
    U32 x;

    x = somebuffer[12].U32;     // Grabs the U32 that starts at somebuffer[12]
    // This would be identical to:
    x = ((U32 *)somebuffer)[3];

Note that a variable value may be used for the sub sections. For example, this is valid:

    U32 x;
    UProcInt somevar;

    For (somevar = 0; somevar < 4; somevar++)
        Print(x.U8[somevar]);

This works by taking somevar, multiplying it by the byte size of the type specified (U8 in this case
is 1 byte), then adding that to the base of x in little endian systems or subtracting it from the
position of the last value (U8 in this case) of x on big endian systems. As you can see, using an
invalid value is definitely not desired so BE SURE you test for oversized values before using this!
This will always cause a low level warning because of how dangerous it can be.

Register values MAY NOT be accessed this way! A register may not have any means of directly
accessing random bytes of data. Only the lowest value is ever accessible:

    Register U32 x;
    ProcInt somevar;

    somevar = x.U8;             // Valid!
                                // Registers have to provide SOME method of single byte access so
                                // this will work
    somevar = x.U8[2];          // ERROR: Invalid variable overload on a register variable

___________________________________________________________________________________________________
Pointers

A pointer is a variable that holds the address of a value, instead of holding the value itself.
Pointers are incredibly powerful in that you can pass where something is stored in memory as opposed
to passing the whole value. This has further value when the value is particularly long, such as an
image or a long string of text. Because AnyScript is based on C/C++, pointers are handled almost the
same in AnyScript as they are in C/C++.

Objects being pointed to ARE NOT automatically initialized/deinitialized because they may be
assigned to existing objects.

To declare a pointer to an object or data type, you simply use a * after the type but before the
variable name, such as:

    ProcInt *somevar;

This initializes somevar with enough memory to hold the pointer, but DOES NOT initialize the memory
that it is pointing to.

In this example, a variable, somevar, is declared as a pointer to one or more ProcInt data elements.
To access multiple data elements, use brackets like a normal array:

    somevar[2];     // Access the third ProcInt in memory that somevar is pointing to

This syntax is also used to point to an array of an unknown length.

To declare a pointer to an array with a specific length, use the * then put parentheses around the
variable name and the number of elements in brackets, such as:

    ProcInt *(somevar[10]);

NOTE: This is different than C/C++, which would declare the same thing as "int (*somevar)[10];"
      The C/C++ syntax is very confusing, because it looks like it would declare an array of 10
      pointers, whereas the AnyScript version makes it very obvious that we are pointing to what's
      inside the parentheses.

This declares somevar to be a single pointer to a 10 element ProcInt array.
In this special case, accessing somevar like an array actually accesses the array itself:

    somevar[2];     // Access the third ProcInt in the array that somevar is pointing to

While this acts essentially identical to just "ProcInt *somevar", it places a restriction in that
accessing the values of somevar only permits using somevar[0] to somevar[9]. Using somevar[10] or
higher would result in an error. Note that no such checking is done when the index is a variable,
such as:

    somevar[a];     // Access element a

Note that this syntax is useful when pointing to two or more dimensional arrays:

    ProcInt *(somevar[10][20][20]);

This means that somevar is a pointer to a three-dimensional array with 10 elements in the first set,
20 in the second set and 20 in the last set (a total of 4,000 elements). For example:

    somevar[4][3][10];  // Accesses element 10 + ((3 + (4 * 20)) * 20) or 1670

To declare an array of unknown length of pointers, you can use two or more *s:

    ProcInt **somevar;      // Declare an array of pointers to ProcInts
    ProcInt ***anothervar;  // Declare an array of pointers to an array of pointers to ProcInts

Note that in our anothervar example, the first element of the array points to another set of
pointers, which in turn point to the final value being pointed to.
Examples of accessing these kinds of pointers are: somevar[1]; // This accesses the second pointer in the array somevar[1][4]; // This accesses the fifth value pointed to by the second pointer in the array anothervar[2]; // Accesses the third array of pointers in the array anothervar[2][1]; // Accesses the second pointer in the third array of pointers anothervar[2][1][4]; // Accesses the fifth value pointed to by the second pointer in the third array of pointers To declare an array of a specific length of pointers, you can declare the number of elements like you would a normal array: ProcInt *somevar[10]; // Create an array of 10 pointers You can access individual pointers by using a single set of brackets, or specific values under specific pointers by using two sets of brackets: somevar[1]; // This accesses the second pointer in the array somevar[1][4]; // This accesses the fifth value pointed to by the second pointer in the array This syntax also works for multiple dimensions of arrays: ProcInt *anothervar[10][20]; Examples of using pointers: ProcInt *somevar, *anothervar; ProcInt q; q = somevar; // Sets q to the value that somevar is pointing to anothervar = somevar; // Sets the value that anothervar is pointing to, // to the value that somevar is pointing to somevar = &anothervar; // Sets somevar to point to the same value that anothervar is // pointing to somevar = &&anothervar; // This will error since somevar is pointing to a ProcInt, not a pointer This also works with Memory Union-style access: somevar.U8 = 1; // Sets the low byte of the first ProcInt of somevar to 1 somevar[4].U16 = 3; // Sets the low word of the fifth ProcInt of somevar to 3 ___________________________________________________________________________________________________ References A reference is basically a pointer to a variable. It is grabbed using & before the value. 
    ProcInt somevar;
    ProcInt *anothervar;

    anothervar = &somevar;    // anothervar becomes a pointer to somevar
    *anothervar = 1;          // This changes somevar to 1

This can also be used on Objects, such as Strings, to tell the compiler to modify the pointer and
avoid any operator overrides that might be in place:

    String *somestring;
    U8 *somepointer;
    String anotherstring = "123";

    &somestring = &anotherstring;    // This makes somestring point to anotherstring; in essence they
                                     // become the same thing
    somestring = "789";              // anotherstring also becomes "789" because both point to the
                                     // same place
    somepointer = &&somestring;      // Since somestring is an object type and & accesses its pointer,
                                     // the special && is needed to access a pointer to its pointer
    &somestring = Null;              // somestring now points to nothing. This does not affect
                                     // anotherstring

___________________________________________________________________________________________________
Arrays

    // This declares an array of 10 ProcInts
    ProcInt somevar[10];

    // This is an example of an inline array with unknown length.
    // Structs that are declared with an inline array with an unknown length cannot be used with SizeOf
    // or be used as an array of values since their size is unknown.
    Struct Sample {
        ProcInt numelements;
        ProcInt elements[];
    };

___________________________________________________________________________________________________
Structs

___________________________________________________________________________________________________
Objects/Classes

All classes hold a hidden pointer to a definition of the class.

***************************************************************************************************
Differences between C/C++ and AnyScript

Most C/C++ code is very specific to the architecture or operating system it is written for. Memory
access, variable types, etc. change vastly when moving from one platform to another.
Since AnyScript is designed NOT to be architecture dependent, a number of changes were made relative
to C/C++, along with a series of simplifications and redesigns of confusing or inconsistent matters.

___________________________________________________________________________________________________
Reserved Names/Words

In C/C++, almost everything in the base language uses lowercase. However, libraries variously use all
upper case, all lower case, camel case, first-letter caps, or the dreaded lower-case type prefix
followed by first-letter caps (e.g., bSomeVar for a boolean), leading to a confusing set of
inconsistent-looking functions, operations, etc.

    #define someVar 1
    #include

    void MyFunction() {
        int q;
        void *x = NULL;

        AnotherFunc(q + someVar, x);
    }

To help with this problem, ALL of AnyScript's internal reserved names/words use first-letter caps (I
like to call it Happy Caps).

    #Define SOMEVAR 1
    #Include "program.anyscript"

    Nothing MyFunction() {
        Int q;
        U8 *x = Null;

        AnotherFunc(q + SOMEVAR, x);
    }

In your own code, you can case your own constants, variables, etc. however you want. In fact, if you
are ever worried about colliding with reserved names, just use lowercase, because nothing built into
AnyScript will ever be all lower case. Every single AnyScript library function, method, object, etc.
will *ALWAYS* use happy caps.

If you really like using some confusing C/C++ constructs, you can always use the "CPP.AnyScript"
script, which basically just declares constants for keywords for you, like:

    #Define #define #Define
    #Define #include #Include
    #Define for For
    #Define void Nothing
        :
        :

C/C++ variable types do not exist in AnyScript in the same form.
The following table lists the closest (but not perfect) AnyScript equivalent for each C/C++ type:

    C/C++                     AnyScript
    void                      Nothing
    void *                    U8 *
    bool                      Boolean
    char                      Char
    unsigned char             U8
    int                       Int
    unsigned int              UInt
    short                     S16
    short int                 S16
    unsigned short            U16
    unsigned short int        U16
    long                      S32Up
    unsigned long             U32Up
    long int                  S32Up
    unsigned long int         U32Up
    long long                 S64Up
    unsigned long long        U64Up
    long long int             S64Up
    unsigned long long int    U64Up
    float                     F32
    double                    F64

The following keywords have changed beyond just a change of case:

    C/C++                     AnyScript
    #elif                     #ElseIf
    #ifdef                    #If Defined
    #ifndef                   #If !Defined
    #undef                    #Undefine
    const                     Constant
    typedef                   DefineType

The following are not currently supported, but support is planned for the future:

    catch
    operator
    template
    throw
    try
    volatile

The following are not supported, and there are no plans to support them in the future:

    asm
    dynamic_cast
    explicit
    mutable
    namespace
    reinterpret_cast
    static_cast
    typeid
    typename
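As a rough point of comparison, the C/C++ to AnyScript type table above has a close analogue in C99's
<stdint.h>. The sketch below is C, not AnyScript, and the typedef and function names are illustrative
assumptions only. It treats the exact-size types (U8, S16, U16) as the fixed-width intN_t/uintN_t
types and the "Up" types as the int_fastN_t/uint_fastN_t types, which C likewise allows to be wider
than N bits when that suits the target. Whether an AnyScript implementation would pick the same
widths for its "Up" types is not something this document states.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Illustrative aliases only: AnyScript names mapped onto C99 <stdint.h> types. */
typedef uint8_t        U8;
typedef int16_t        S16;
typedef uint16_t       U16;      /* exactly 16 bits on every platform */
typedef uint_fast16_t  U16Up;    /* at least 16 bits, possibly wider */
typedef int_fast32_t   S32Up;
typedef uint_fast32_t  U32Up;
typedef int_fast64_t   S64Up;
typedef uint_fast64_t  U64Up;
typedef float          F32;      /* assumes float is a 32-bit IEEE 754 type */
typedef double         F64;      /* assumes double is a 64-bit IEEE 754 type */

/* Width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Adding one to an exact-size U16 holding 0xFFFF wraps around to 0,
   regardless of how wide int is on the host. */
U16 u16_increment(U16 value) {
    return (U16)(value + 1u);
}

/* Exact-size types have fixed widths; "Up" types only promise a minimum. */
void check_widths(void) {
    assert(BITS_OF(U8) == 8);
    assert(BITS_OF(S16) == 16);
    assert(BITS_OF(U16) == 16);
    assert(BITS_OF(U16Up) >= 16);
    assert(BITS_OF(S32Up) >= 32);
    assert(BITS_OF(U32Up) >= 32);
    assert(BITS_OF(S64Up) >= 64);
    assert(BITS_OF(U64Up) >= 64);
}
```

Calling check_widths() on a given compiler shows which guarantees are fixed and which are only lower
bounds; printing BITS_OF(U16Up) on different targets is a quick way to see the "or more" behaviour.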