|
| Author |
Message |
|
elfprince13

OVER NINE THOUSAND!

Joined: 23 May 2005 Posts: 10232 Location: A galaxy far far away......
|
Posted: 16 Mar 2012 11:43:40 pm Post subject: |
|
|
| shkaboinka wrote: | | elfprince13 wrote: | | If you can't do it dynamically, it seems like a very odd language feature. Static aliasing isn't any different from a less powerful version of #define macros in C/C++. On the other hand, done dynamically, I thought it was kind of a neat syntax to make pointers a little more intuitive. |
Sorry about that. No, I don't hide pointers; but the compiler will insert dereferences/addresses to make the RHS suitable for the LHS (e.g. "ptr = nonptr" will convert to "ptr = &nonptr", which makes "by reference" arguments easier; see the overview). When I tried providing macros in Antidisassemblage, I came to the conclusion that they cause more problems than they solve. Instead, the compiler uses data-flow analysis to trace values and predetermine as much as possible; but you can also use the $ operator to require this (i.e. it's an error if it cannot, and it's a hint to be more vigorous -- e.g. "interpreted" loops are unwound without question). Thus, full-fledged functions/variables/constructs can be used as "smart" macros (static checks on everything versus blindly inserting code which can be misused).
You'd think that would void the need for any kind of static aliasing (and in most cases, yes); but the "@" is still necessary to attach variables (and functions, etc.) to explicit addresses, so as to (for example) attach a name to where the TIOS stores the cursor position, and use it directly as a variable! This is a large motivation for the underlying structure of things to be as close to machine representation as possible (so that you can map directly to arrays, routines/precompiled-functions, or even structs designed to match an external storage format). |
Fair enough.
| Quote: | | elfprince13 wrote: | | Then why not (x,y)@a[1:3] or a={0,x@1,y@2,3}? |
That second one (a={0,x@1,y@2,3}) is one of the options I suggested initially, so your suggestion (and response to how it fits the context better) makes me lean toward it. As for the first part, I like the semantics of it, but I am still debating about how tuples might mix into variable declarations (if at all), so how about this:
- Allow (a={0,x@1,y@2,3}), so that large lists of values can be mapped directly into arrays.
- Allow a[1:3] to grab a tuple out of an array (as a shorthand; it would be exactly equivalent to (a[1],a[2]))
- If we treat a tuple as "one thing" (this paradigm essentially removes lists from assignments, etc.), then we could go off of the current syntax for declaring individual variables:
Code: // If tuples were allowed in variable declarations:
byte (a,b)@(x,y) = (1,2);
byte (a=1, b=2)@(x,y);
byte (a@x, b@y) = (1,2);
byte a@x=1, b@y=2; // <-- This is standard (e.g. not relating to tuples)
// (Of course, any "(1,2)" can be replaced with "array[n:n+2]")
|
In Python, the fundamental difference (from an API perspective) between tuples and lists/arrays is that tuples are immutable. I see no particular reason why you couldn't unify those with a byte {x,y}@a[1:3] sort of deal. _________________ StickFigure Graphic Productions || VSHI: Vermont Sustainable Heating Initiative
 |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 19 Mar 2012 12:20:23 am Post subject: |
|
|
After much thought on tuples/comma-lists, I think the following changes offer a simple solution:
(0) A "tuple" in OPIA will be an abstract representation solely for the purpose of grouping values together (i.e. no tuple-variables: A function would return a "tuple" of values rather than one value which is "a tuple"). Any list of comma-separated values is a "tuple", and a tuple is just a list of values.
(1) Assignment (and all else) has precedence over a comma; but parentheses may be used to group things as normal.
(2) A Tuple does not "bind" into a separate entity (i.e. stays "unpacked"). A "tuple within a tuple" is just one larger tuple.
Code: byte a, b = c, d; // same as: byte a, (b = c), d;
byte (a, b) = (c, d); // same as: byte a = c, b = d;
byte a = (((b))); // same as: byte a = b; (as expected)
(a,b,c) = ((x,y),z); // same as: (a,b,c) = (x,y,z)
(a,b) = ((x,y),z); // INCORRECT: (x,y) is still TWO items
The benefit of rule (2) is simplicity: no "lists of lists" or "packing" to deal with (this also means that if you embed a call to [a function which returns two values] inside of another function call, then it fills two places of the argument list). The main motivation for this is so that (((X))) resolves to X as expected (rather than a tuple-tuple-tuple of X).
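For anyone who wants to experiment with rule (2) outside of OPIA, here is a minimal Python sketch of the flattening behaviour. It is only an illustration under stated assumptions: Python tuples stand in for OPIA's comma-lists, and `flatten` and `assign` are hypothetical helper names, not anything from the OPIA compiler.

```python
def flatten(values):
    """Splice nested tuples into one flat tuple (rule 2: tuples never nest)."""
    flat = []
    for v in values:
        if isinstance(v, tuple):
            flat.extend(flatten(v))  # a tuple inside a tuple is just more values
        else:
            flat.append(v)
    return tuple(flat)


def assign(targets, values):
    """Pair flattened targets with flattened values; a count mismatch is an error."""
    names, vals = flatten(targets), flatten(values)
    if len(names) != len(vals):
        raise ValueError(f"{len(names)} targets but {len(vals)} values")
    return dict(zip(names, vals))


# (a,b,c) = ((x,y),z) behaves like (a,b,c) = (x,y,z)
print(assign(("a", "b", "c"), ((1, 2), 3)))
```

Under this model, `assign(("a", "b"), ((1, 2), 3))` raises an error, mirroring the INCORRECT case above where (x,y) still counts as two items.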
An alternative that I might consider is to use angle brackets <...> for tuples, so as to differentiate them from parenthetical statements, and thus allow for tuples within tuples (but they would still just be abstract representations for grouping, rather than actual datatypes):
Code: (a,b) = (c,d); // INVALID (use <...> to make a tuple)
<a,b> = <c, d>; // a=c; b=d;
<a,b> = <<x,y>,z>; // a=<x,y>; b=z;
<a,b,c> = <(x,y),z>; // <a,b,c> = <x,y,z>;
I prefer the first method (i.e. rule 2 up top) because I think it's a lot simpler, and because I cannot think of a scenario where it would even be useful to have a tuple within a tuple (i.e. no tuple-variables also means no "a = <x,y>" scenarios. You'd have to explicitly say "<<a,b>,c> = <<x,y>,z>" instead, which can always be reduced to "<a,b,c>= <x,y,z>"). This would mean that a 1-tuple is just a value, and a 2-tuple is just 2 values (e.g. ((a),(b,c)) is one value followed by two values, which is just three values or a 3-tuple).
The syntax I chose for functions is related to (and could be affected by) this issue:
Code: func Foo(byte x):byte { ... } // The syntax I preferred for single return values,
func Boo(byte x):(byte,byte) { ... } // ...but I hated it for multiple return values.
func Moo(byte x : byte,byte) { ... } // I found this syntax cleaner.
func(byte : byte,byte) fPtr; // Func-pointers must use same syntax,
func(func(byte:byte,byte),byte) g; // without "bleeding" into other lists
If I were to place the return values on the outside, then parentheses would be required around them if they were multiple values (otherwise function pointer return-values could "bleed out" into other arguments or value-lists). However, if the idea is that parentheses always surround "tuples", then perhaps it's not so bad -- though if I keep them "inside" with the arguments, then I don't need to require parentheses (unless I go with the <...> syntax, since "func(:byte)" and "func(:<byte>)" would have slightly different semantics for operating on the returned byte; but both would return a raw byte).
Last edited by shkaboinka on 20 Mar 2012 11:16:04 am; edited 1 time in total |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 20 Mar 2012 12:09:04 am Post subject: |
|
|
I think I'd prefer to just use parentheses to group tuples (lists of values), and I want to allow "a[b:c]" to be shorthand for "(a[b], a[b+1],...,a[c-1])". If anybody has any further opinions/questions/etc. on tuples/lists, I'd still like to hear them; otherwise here are the (next) two things I'd like to hear feedback on:
(1) Which function syntax should be used? One thing to consider is how mixing in function-pointers and tuples will affect how "readable" a declaration is, depending on which setup is chosen:
Code: // Inner (current) setup:
func( ... : X) // returns one X-value
func( ... : X, X) // returns two X-values
func(A, func(B : C) : func( : func())) // (can you guess?)
// Outer setup:
func( ... ):X // returns one X-value
func( ... ):(X,X) // returns two X-values
func(A, func(B) : C) : func() : func() // (can you guess?)
(Answer: a function which takes an A and a [function which takes a B and returns a C] and returns a [function which takes nothing and returns a (function which takes and returns nothing)].)
(2) I REALLY need to discuss interpreted aspects (the $ operator). For functions and variables, using $ on the declaration/definition will mean that it is always "interpreted", whereas using $ in specific places only requires it to be "interpreted" that time. Flow-control constructs can also be interpreted if marked with $ (e.g. an interpreted "if-else" would attempt to predict the condition and then replace everything with JUST the if-code or JUST the else-code, and have no condition/jump at all. Loops would be "unrolled" directly). While some of this is clear, I am uncertain if it would be ok to sometimes have this result in actual runtime code (e.g. unrolling a loop resulting in several repeated statements instead of just some value -- even if the statements are simplified). ... I need to know what people might expect from this feature, or how they (you) might interpret such a mechanism.
THOUGHTS ON THESE PLEASE!!!
(remaining comments on tuples/lists as well, if there are any more) |
|
|
|
elfprince13

OVER NINE THOUSAND!

Joined: 23 May 2005 Posts: 10232 Location: A galaxy far far away......
|
Posted: 20 Mar 2012 12:50:21 am Post subject: |
|
|
Nested tuples are actually quite nice in Python, and I use them in loops a good deal.
To distinguish them from parenthetical expressions, require them to contain commas.
i.e.
(a,) = ((b,))
but not
(a,) = ((b,),).
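As a concrete illustration of the two points above, here is a small Python sketch. The variable names and the `segments` data are made up for the example; it only shows standard Python behaviour and says nothing about how OPIA would handle this.

```python
b = 7

# Parentheses alone only group: the right side is the 1-tuple (b,).
(a1,) = ((b,))

# The trailing comma adds a level: the right side is ((b,),),
# so the single target receives the inner tuple (b,).
(a2,) = ((b,),)

# Nested tuples unpacked directly in a loop header.
segments = [((0, 0), (3, 4)), ((1, 1), (4, 5))]
lengths = []
for (x0, y0), (x1, y1) in segments:
    lengths.append(((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5)

print(a1, a2, lengths)
```

The loop is the kind of use where nesting pays off: each element is a pair of points, and the header names all four coordinates at once.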
Good choice with the slicing shorthand. Will you support negative indices as well?
Also, I like the "outer" setup better for functions. _________________ StickFigure Graphic Productions || VSHI: Vermont Sustainable Heating Initiative
 |
|
|
|
merthsoft
File Archiver

Joined: 09 May 2010 Posts: 2735
|
Posted: 20 Mar 2012 09:21:53 am Post subject: |
|
|
I agree; I like the outer setup better too. I'm a little unclear on what all this $ business is, though.
As for negative indices, they're nice, but it might also be nice if we could declare the range on a list. For example, in C# you can do:
Code: using System;
public class Test {
    public static void Main() {
        var arr = Array.CreateInstance(typeof(int), new[] {10}, new[] {-5});
        for (int i = -5; i < 5; i++) {
            arr.SetValue(i*i, i);
        }
        for (int i = -5; i < 5; i++) {
            Console.WriteLine(arr.GetValue(i));
        }
    }
}
And get:
| Quote: | 25
16
9
4
1
0
1
4
9
16 |
But if C# supported .NET's negative-indexed arrays, you could do something like arr[-3] = 10 or w/e. In Python arr[-3] would really be arr[arr.length-3]. _________________ Shaun |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 20 Mar 2012 12:17:26 pm Post subject: |
|
|
| elfprince13 wrote: | | Nested tuples are actually quite nice in Python ... To distinguish them from parenthetical expressions, require them to contain commas, e.g. (a,) = ((b,)) |
I'll bet they are quite nice in a dynamic interpreted language; but would this apply to a static compiled language? Perhaps if we had an example (in Python), we could discuss how that would best translate, and whether it's worth whatever mechanisms it would require for a static compiled language. (If so, though I'm still skeptical, the extra comma is not a bad approach.)
| elfsoft wrote: | | I like the outer setup better for functions. |
Me too, actually. I went back and forth at least twice before, and ultimately settled on the "inner" setup because I felt that it would leave func-pointers more contained, and that the extra parentheses made it messy. However, the "outer" setup is clearer, and it would be a rare case for a function to take a pointer to a tuple-returning-function as an argument (and in the case of returning something complex, it doesn't clutter up the inside). CONSIDER THIS DONE.
| merthprince wrote: | | Will there be support for negative [array] indices? |
If I allow them, then negative indexes will refer to whatever is before the array. Arrays will ALWAYS range from 0 to (length-1). My plan is to have no runtime bounds-checking, so the compiler will just trust that to the programmer (and I am debating whether it should generate an error for cases which can be determined at compile-time). However, I am open to discussion about a notation for making arrays prefixed with a size (hypothetical: "[]byte arr" has no built-in size, "[byte]byte arr" uses a byte for the size, and "[word]byte arr" uses a word ... or maybe ubyte and uword). Note that string literals automatically have an extra zero appended to them, which I know is an assumption that might need addressing ("foo" becomes []char{'f','o','o',0}).
| merthsoft wrote: | | I'm a little unclear what all this $ business is |
The compiler will precompute whatever it can (e.g. so that you can use actual code to embed a computed value), but the $ operator REQUIRES the compiler be able to do this on something (and grants more liberties to do so, such as loop-unrolling). Here is an example:
Code: byte a = 5, b = a+7;
if(a+b < 15) { doA(); }
else { doB(); }
while(a < 100) { a += a; }
$while(b < 100) { b += b; }
return $(a+b);
// The compiler precomputes that code, resulting in THIS code:
byte a = 5; // b = 12, but is never DIRECTLY needed, so it is removed
doB(); // 5+12 is clearly not less than 15
while(a < 100) { a += a; } // The compiler leaves loop alone,
// ... unless TOLD ($) to: b=12+12 ...24+24 ...96+96 ... b is 192
return $(a+192); // ERROR! (value of a is unknown; cannot evaluate)
More specifically, "$x" would mean "give me the actual value of x (assuming that you know it)", regardless of how "x" was declared; but declaring "byte $x" has the same effect as putting the $ on each and every occurrence of that "x". The same goes for functions as well (e.g. inline it and require the result to resolve to a determinable value). One of the major points to debate for this case is whether "interpreting" a function can result in some runtime code being left in, even if the resulting value can be determined at compile time. I'm leaning on having it be strict, since a simple function ought to be automatically inlined anyway, so I don't mind letting people just assume that a 1 or 2 statement function would be inlined (unless the compiler decides that it's better not to anyway). |
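To check the arithmetic in the example above, here is a rough Python sketch of what "interpreting" the marked pieces ahead of time amounts to. This is not how the OPIA compiler is implemented (nothing in the thread specifies that); `unroll_doubling` is a hypothetical helper that simply runs the doubling loop once and keeps only the result.

```python
def unroll_doubling(start, limit):
    """Evaluate `while (v < limit) { v += v; }` ahead of time.

    Returns the final value and the trace of intermediate values.
    """
    if start <= 0 and start < limit:
        raise ValueError("loop would never terminate; cannot precompute")
    v = start
    trace = [v]
    while v < limit:
        v += v
        trace.append(v)
    return v, trace


a = 5
b = a + 7                                     # known up front: 12
branch = "doA" if a + b < 15 else "doB"       # the condition is decidable
b_final, trace = unroll_doubling(b, 100)      # the $while on b

print(branch, b_final, trace)
```

The unmarked loop on `a` is deliberately not evaluated here, which is why `$(a+b)` in the example cannot be resolved: only `b` has a known value at that point.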
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 23 Mar 2012 09:50:30 am Post subject: |
|
|
Tuples will not "exist":
All I am doing with "tuples" is taking an operation and giving it a name. For example, if "a+b" is a "sum", then a "tuple" is no more of a THING than a "sum" is. I am only providing tuples as a notational syntax for giving multiple values where only one would be expected (e.g. return (three,separate,values)). Thus, when I say that "(a,b) = (c,d)" is equivalent to "a=c, b=d", I mean they are fully semantically equivalent and not just computationally equivalent. It is merely the difference between saying "a and b are c and d, respectively" and saying "a is c and b is d" -- and thus ((A,B),C) is the same as (A,B,C) just as ((a+b)+c) is the same as (a+b+c) -- only you cannot write ((a+b)*c) without some parenthesis.
I'd like to keep calling them "tuples", since that is what they are; but if that is confusing for people (since some other languages have STRONG implications for what that means), then I can stop calling them anything -- just like how a+b is an addition, but it's not "AN" addition.
(This also means that (x,) would be a syntax error. If you want to physically store values in a list, then use an array; or use an array of pointers if you just want to refer to them from a physical list).
EDIT: Let's discuss syntax/semantics of multidimensional arrays though (I am considering whether to allow an [x,y] syntax as in C#, though [x][y] would result in the same thing if x and y are actual values; though on the other hand, a "new [x][y]thing" would probably have different implications than a "new [x,y]thing") |
|
|
|
merthsoft
File Archiver

Joined: 09 May 2010 Posts: 2735
|
Posted: 23 Mar 2012 10:08:29 am Post subject: |
|
|
In C# you can have both byte[,] array and byte[][] array, and they have different meanings--I think this is worthwhile. byte[,] is a multidimensional array, so byte[,] arr = new byte[3,3] is a 3x3 array. byte[][] is a jagged array, so byte[][] arr = new byte[3][] is a single-dimensional array that has three elements, each of which is a single-dimensional array of bytes. I'm sure you know all this, but it's good for anyone else reading through this that may not understand the differences. Also, it gets fun when you combine them:
Code: int[][,] jaggedArray4 = new int[3][,] {
new int[,] { {1,3}, {5,7} },
new int[,] { {0,2}, {4,6}, {8,10} },
new int[,] { {11,22}, {99,88}, {0,9} }
};
Here's a good resource discussing them:
http://stackoverflow.com/questions/597720/what-is-differences-between-multidimensional-array-and-array-of-arrays-in-c
I say we allow [,] syntax, but I also wish this language were more like C#, so maybe I'm just biased.
EDIT: Though the performance issue is an interesting one. Multidimensional arrays are slower to access in C#, though your implementation could foreseeably be faster than C#'s. At the very least I find the syntax more attractive: var[5,6] = whatever just feels better when that's what you actually mean. It's also better when you start doing more dimensions: var[3,6,7,4,2] is better than var[3][6][7][4][2]. _________________ Shaun |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 23 Mar 2012 11:18:37 am Post subject: |
|
|
Thank you for posting those examples for everyone (I wasn't up to it yet)!
I'd like to consider something similar for OPIA, but there are some issues that would need addressing if I do: (1) To keep [,] rectangular (versus the jagged [][]), there would have to be a way to embed (or specify) size information (OPIA does not currently do this for native-compatibility reasons). (2) To avoid ambiguity, [x][y] should mean something different than [x,y]. ... Let's look at this step by step so we can decide on a good setup altogether:
Code: // Current setup:
[]byte // Points to a byte-array
[5]byte // IS a (static) byte-array (5 values)
[_]byte // Same as above, but size is determined from usage
*[5]byte // Same as []byte, but requires/assumes the underlying array is 5 bytes
[5][5]byte // An array of five [5]byte values (no pointers!)
[5][]byte // (static) Array of five []byte values (five pointers)
[][5]byte // Points to an array of five [5]byte values
[][]byte // Points to an array of pointers to arrays
I could deal with (issue 1) by providing a way to explicitly embed size information at the front of an array:
Code: // Hypothetical way to embed size-information:
[byte]byte // A (static) byte-array, prefixed by a byte to store the size
[word]byte // A (static) byte-array, prefixed by a word to store the size
*[byte]byte // Pointer to such an array
I could deal with (issue 2) by making every inner set of square brackets [...] become a pointer (e.g. make [a][b] ALWAYS "jagged" and [a,b] ALWAYS rectangular); and for symmetry, I could have the outermost [...] ONLY be a pointer if the * is used. It would look like this (I will just say "array" for "static array", and "pointer to array" otherwise):
Code: []byte // Array of (some determinable number of) bytes (previously indicated as [_]byte)
[5]byte // Array of 5 bytes
[byte]byte // Array of bytes (prefixed with a byte to indicate the size)
*[]byte // Pointer to an array of bytes
*[5]byte // Pointer to an array of 5 bytes
*[byte]byte // Pointer to a (byte-) size-prefixed array of bytes
[][]byte // Array of *[]byte values.
[5][5]byte // Array of five *[5]byte values.
[5,5]byte // Array of five [5]byte values (no pointers)
[5][5,5]byte // Array of five *[5,5]byte values
[5,5,5,5]byte // A 4-Dimensional array of bytes (no pointers)
[5,5][5,5]byte // 2D array of pointers to [5,5]byte values
This seems all well and good so far ([]T has become *[]T, but [5]*[5]T has also become just [5][5]T, and [5][5]T has become [5,5]T). However, since you cannot even COMPUTE a 2D index without at least knowing the inner-most dimensions of a multidimensional array (since it is all internally stored as a 1D array), you would have to specify (at least some of) the dimensions when working with pointers:
Code: [,]byte // Static 2D array (NOTE: the sizes have to be determinable!)
*[,]byte // ERROR! (cannot even COMPUTE where [x,y] is without the inner dimension)
*[,5]byte // Pointer to a [?,5]byte (ok: [x,y] is [5*x+y])
*[,byte]byte // (also ok, since [x,y] is [SIZE*x+y], and size is stored as a leading byte)
*[byte,]byte // ERROR! (It's the inner dimension that matters)
*[5,5]byte // Pointer to a [5,5]byte -- Very ok
I'd be ok with allowing a multidimensional array to be treated as an array with fewer dimensions:
Code: [3,4,5]byte arr;
*[3,4,5]byte p3 = &arr; // p3[x,y,z] refers to arr[x,y,z]
*[,5]byte p2 = &arr; // treat arr as a [12,5]byte (e.g. p2[4*x + y, z])
*[]byte p1 = &arr; // treat arr as a [60]byte (e.g. p1[5*4*x + 5*y + z])
// I'd like NOT to allow this (because then arr[x][y][z] == arr[x,y,z]):
arr[x] // gives the xth [4,5]byte value
arr[x,y] // gives the [x,y]th [5]byte value
Last edited by shkaboinka on 26 Mar 2012 08:20:17 am; edited 9 times in total |
|
|
|
elfprince13

OVER NINE THOUSAND!

Joined: 23 May 2005 Posts: 10232 Location: A galaxy far far away......
|
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 26 Mar 2012 08:54:25 am Post subject: |
|
|
Ok, here are some bottom lines for arrays that I want to stick with (you can forget what I said so far if this makes it simpler):
(1) The first part of any array (e.g. the [X] part of [X][Y][Z]T) will designate a static array if all of its dimensions are given; otherwise it will designate a pointer to an array:
Code: [5][Y][Z]T // Static array of 5 [Y][Z]T values
[ ][Y][Z]T // Pointer to array of (some number N of) [Y][Z]T values
[5,5][Y][Z]T // Static array of 5x5 [Y][Z]T values
[ ,5][Y][Z]T // Pointer to an array of Nx5 [Y][Z]T values
*[5][Y][Z]T // Pointer to (because of *) array of 5 [Y][Z]T values
*[ ][Y][Z]T // Pointer to pointer to array of N [Y][Z]T values
(2) Any pattern of [X][Y][Z] should create a jagged array and [X,Y,Z] should create a rectangular array. This is done by making all "inner" [...] values be pointers -- REGARDLESS of what dimensions are given:
Code: [5][ ]T // Array of 5 (pointer to array of T) values
[5][5]T // Array of 5 (pointer to array of 5 T) values
(3) "Inner dimensions" (e.g. the x's in [,x]T or [,x,x]T) must be given explicitly (as numbers). I'd LIKE to allow types like [,]T, but it just is not feasible without either providing a clunky mechanism or adding assumptive overhead (which will fail to match native system structures). However, this CAN be done manually as in (4):
(4) Multidimensional arrays (not jagged arrays) can be treated as one dimensional arrays (since that is how they are actually stored). For completeness, I could allow other conversions as well (that is, if you can go from MxN to N, you should be able to go from N to MxN; thus if you can go from LxMxN to N to MxN, you might as well go from LxMxN to MxN, etc.):
Code: [L,M,N]T arr; // An LxMxN array of T values
[ ,M,N]T p3 = arr; // Pointer to an ?xMxN array (3D)
[ ,N]T p2 = arr; // Pointer to an ?xN array (2D)
[ ]T p1 = arr; // Pointer to an (N) array (1D)
p3[x,y,z] == p2[x*M+y, z] == p1[(x*M+y)*N+z] == arr[x,y,z]
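The index identities on the last line can be checked with a short Python sketch. This assumes row-major storage in one flat block, as described in (4); the sizes 3x4x5 and the accessor names `p1`, `p2`, `p3` are chosen here only to mirror the example, with a Python list standing in for the underlying memory.

```python
L, M, N = 3, 4, 5              # example sizes for an [L,M,N]T array
flat = list(range(L * M * N))  # the one flat block that backs every view


def p1(i):
    """1D view: plain offset into the block."""
    return flat[i]


def p2(row, z):
    """2D view as ?xN: only the inner dimension N is needed."""
    return flat[row * N + z]


def p3(x, y, z):
    """3D view as ?xMxN: only the inner dimensions M and N are needed."""
    return flat[(x * M + y) * N + z]


print(p3(2, 3, 4), p2(2 * M + 3, 4), p1((2 * M + 3) * N + 4))
```

Note that the outermost size L never appears in any of the index formulas, which is why it can be left blank in the pointer types while the inner dimensions cannot.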
|
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 04 Apr 2012 11:42:55 am Post subject: |
|
|
I have updated the Overview to reflect my changes for arrays, tuples, functions, and default arguments.
As for interpreted stuff ($), I think that I want it to always be "deep" (recursive). That is, "$foo(...)" will cause ALL of foo to be interpreted, and "$while(...) { ... }" would cause everything in the loop to be interpreted (including inner constructs). This means that all variables declared within must be determinable, but external variables referenced may be affected at runtime.
The reasons for choosing a "deep" evaluation (versus just that "layer") are that (1) it's easier to mark a section, and (2) this is the more likely use anyway. I'd rather let single layers be optimized by the compiler rather than see people put $'s all over the place where they think that an optimization can occur ... but I do like to allow "Hey, interpret that whole thing ... I just didn't want to compute the values and embed them all by hand".
One other thing: I think I want to replace "const" values with a "final" modifier, on the grounds that a "final" value is actually embedded in the program as an (immutable) variable (e.g. embed "BIG_UGLY_STRING" as a final value, rather than once for EACH time it is used) -- as in Java. As for value-holders, you can use $ to indicate this anyway. How does that sound? |
|
|
|
merthsoft
File Archiver

Joined: 09 May 2010 Posts: 2735
|
Posted: 04 Apr 2012 12:41:13 pm Post subject: |
|
|
Sounds good to me. I still don't fully get the interpreted thing, though. _________________ Shaun |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 04 Apr 2012 09:50:39 pm Post subject: |
|
|
| merthsoft wrote: | | Sounds good to me. I still don't fully get the interpreted thing, though. |
This is the best explanation of interpreted aspects that I have -- And I pose a GOOD POINT (at the end) that I'd like feedback on from anyone, as it will greatly affect one of the strongest implications of the language!
Ok, let me give you a good example: suppose I have a program that displays 3D wireframe objects, and I'd like it to include some demo objects, one of which is a sphere (mapped off of a cube). The code might look something like this:
Code: // The points [face, row, col, xyz component]:
[6,5,5,3]byte pts; // can be treated as a [6*5*5, 3]byte
// Map one face:
for(a := 0; a < 5; a++) {
    for(b := 0; b < 5; b++) {
        x := (a-2)*50;
        y := (b-2)*50;
        z := 100;
        dist := sqrt(x*x+y*y+z*z);
        pts[0,a,b,0] = x/dist;
        pts[0,a,b,1] = y/dist;
        pts[0,a,b,2] = z/dist;
    }
}
// Map other faces from the first:
for(a := 0; a < 5; a++) {
    for(b := 0; b < 5; b++) {
        pts[1,a,b,2] = pts[2,a,b,1] = pts[0,a,b,0];
        pts[1,a,b,0] = pts[2,a,b,2] = pts[0,a,b,1];
        pts[1,a,b,1] = pts[2,a,b,0] = pts[0,a,b,2];
        pts[3,a,b,0] = pts[4,a,b,2] = pts[5,a,b,0] = -pts[0,a,b,0];
        pts[3,a,b,1] = pts[4,a,b,0] = pts[5,a,b,1] = -pts[0,a,b,1];
        pts[3,a,b,2] = pts[4,a,b,1] = pts[5,a,b,2] = -pts[0,a,b,2];
    }
}
That code will compute the shape I described very nicely and uniformly. However, this also means that every time the program runs it must actually compute these values all over: go through all the loops, initialize all the values, etc. The optimal way to do this would be to compute all the points separately and embed them directly into the program so that it "has them" to begin with:
Code: [6,5,5,3]byte pts = {... EVERY...SINGLE...POINT... };
The problems with doing it manually are that it's tedious, error-prone, and a pain if you have to change anything about it. Thus it is often a "fair trade" to just have the program compute it all at runtime, even though it's a set computation.
(Note: there ARE instances where it is actually better to compute values at runtime, such as when you need to generate a lot of data, but embedding it directly would make the program larger than it would be to just have the CODE for it.)
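To make the "compute separately and embed" workflow concrete, here is a hedged Python sketch of a build-time script that computes the first face and prints it as a brace-delimited initializer that could be pasted into source. Several things here are assumptions for illustration: the `RADIUS` scale factor (added so the normalized coordinates survive rounding to byte-sized integers), the helper names, and the exact literal format.

```python
import math

RADIUS = 100  # assumed scale so results fit in a signed byte


def cube_face_on_sphere(n=5, step=50, z=100):
    """Project an n x n grid on one cube face onto a sphere of RADIUS."""
    face = []
    for a in range(n):
        row = []
        for b in range(n):
            x = (a - n // 2) * step
            y = (b - n // 2) * step
            dist = math.sqrt(x * x + y * y + z * z)
            row.append(tuple(round(RADIUS * c / dist) for c in (x, y, z)))
        face.append(row)
    return face


def as_literal(face):
    """Render the face as a nested brace initializer string."""
    rows = ("{" + ", ".join("{%d,%d,%d}" % p for p in row) + "}" for row in face)
    return "{" + ", ".join(rows) + "}"


FACE0 = cube_face_on_sphere()
LITERAL = as_literal(FACE0)
print(LITERAL)
```

This is exactly the tedious step described above: any change to the grid size or spacing means re-running the script and pasting the output again.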
What OPIA offers with the $ operator is that it tells the compiler to actually interpret the code and embed just the result of the computations into the program. Essentially, you could write all the code to compute something, but marking it with $ makes it as if you computed it all by hand and just put the result in the program; only the compiler does it for you. The same code with the $ assertions would look like this:
Code: [6,5,5,3]byte pts;
$for(a ... // Insert all the same stuff as before,
... // including the inner loop and everything
}
$for(a ... // Insert all the same stuff as before,
... // including the inner loop and everything
}
...Actually, OPIA will do this ANYWAY: whenever and wherever things can be predetermined at all. The $ operator is an assertion saying "this MUST happen here", and the compiler calls it an error if it cannot (for example, trying to use some value embedded in the OS which cannot possibly be known ahead of time). ... I am willing to debate this point (should OPIA automatically preinterpret code if not explicitly marked with the $ operator? Is there merit to leaving such computations in the program unless told to precompute?).
Note (again): If one were to allocate "new" memory and compute into it, the compiler would still have the program allocate new memory and compute into it -- though it might be able to convert a computation into statically embedded data (as my example did) and then just copy that over with a few instructions ... In fact, that might be a case where it would be good to NOT autocompute code (at least not loops) unless told to with $ -- OPINIONS??? |
|
|
|
AHelper

LONG LIVE COMICTECH

Joined: 30 Jan 2011 Posts: 1658 Location: Aufhelperstan, Utopian Republic
|
Posted: 04 Apr 2012 10:16:09 pm Post subject: |
|
|
I think that OPIA should preinterpret code by default. I assume that you mean having a = 1+2; would have the 1+2 precomputed, as well as more complex things. Personally, I would say to preinterpret as much as possible to save code size by default and optionally turn off the feature in cases of stress testing, either globally or on a case-by-case basis.
ex, on your note, if you have it on by default, you could have something like !$ or such to explicitly disable precomputation.
If you are aiming to label this as an optimizing feature, calculate the size/speed of both and choose the best one. _________________ °ᴥ° Get Lucky
<BrandonW> "You don't even want to know what TI Connect does when it's just detecting your calculator...It ACTUALLY ERASES THE SWAP SECTOR on every communication attempt...EVERY SINGLE ATTEMPT...Yes, TI Connect will kill your calculator..What do I have to do to get your attention?!....Such a bloated protocol." |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 04 Apr 2012 11:56:41 pm Post subject: |
|
|
| AHelper wrote: | I think that OPIA should preinterpret code by default. I assume that you mean having a = 1+2; would have the 1+2 precomputed, as well as more complex things. Personally, I would say to preinterpret as much as possible to save code size by default and optionally turn off the feature in cases of stress testing, either globally or on a case-by-case basis.
ex, on your note, if you have it on by default, you could have something like !$ or such to explicitly disable precomputation.
If you are aiming to label this as an optimizing feature, calculate the size/speed of both and choose the best one. |
Expressions like "1+2" will be evaluated on the spot regardless (that goes without saying in almost every serious compiled language). And I don't see the harm in tracing values through either. The main point of debate is on flow-control constructs -- and perhaps at different levels (it's more of a no-brainer to cut out an "if" or an "else", or to turn a "while(1==1)" into an unconditional loop). For me, the point at which it becomes "controversial" is when nonlinear things (e.g. loops and recursive functions) get "rolled out" into their linear equivalents, because then you really do end up with larger code if there are runtime side-effects -- even if those side effects can be expressed as an initialization.
If I allow a "no optimize" feature, I think I'd prefer it to be a compiler option rather than something in code; but I'm thinking now that there may be merit in not unraveling loops unless explicitly stated, even if there are no runtime side-effects. For example, incrementing in a loop is computationally equivalent to one large addition; but if your purpose was to create a timed delay, then that "optimization" would ruin it.
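The delay-loop concern can be illustrated with a small Python sketch. The function names are invented for this example, and the `steps` counter is only a stand-in for elapsed time on real hardware.

```python
def delay_loop(n):
    """Count up one step at a time; the work done grows with n."""
    counter = 0
    steps = 0
    for _ in range(n):
        counter += 1
        steps += 1  # stands in for the time each iteration would take
    return counter, steps


def delay_folded(n):
    """What an eager optimizer might emit: same final value, no iterations."""
    return 0 + n, 0
```

Both versions leave `counter` at the same value, so they are computationally equivalent, but only the first one actually spends the iterations that a timed delay depends on.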
I think what it comes to is that there are times when you want to NOT optimize, when you want to REQUIRE the optimization, and when you want to TRY to optimize (e.g. no error if it cannot). Some options for optimizing nonlinear code (e.g. loops) might look like this:
(1) Default=NOT, $=REQUIRE -- "Do or do not; there is no try."
(2) Default=NOT, $=TRY, ($$=REQUIRE?) -- "Whatever works, man."
(3) Default=TRY, $=REQUIRE, (!$=NOT?) -- "Compiler knows best."
Though I really don't like imagining code with an explicit "don't optimize this!" in it, I can get behind it in that you shouldn't/wouldn't do that unless you really freakin' have a purpose. Again, "NOT" only refers to nonlinear situations. Expressions, value-propagation, and if-else chains would still be optimized if they are knowable.
EDIT: Though (2) is "safe", it's a pansy approach. If you say $, then it better be required. I don't like (3) because you OUGHT to be able to write an intentional delay-loop without "!$" on it. I think I like (1) best because it doesn't take surprising liberties with your code, but also takes an explicit $ as more than a suggestion (e.g. using it at ALL is like putting a "THIS IS PRECOMPUTED CODE!" sticker on it). ... I still may have loops interpreted anyway if it's clear they'd not have runtime side-effects anyway. |
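As a rough way to compare options (1) to (3) above, here is a sketch in Python of how each one would treat a loop. The `decide` function, the `Mode` enum, and the `OPTIONS` table are all hypothetical names for this example. The "$$" and "!$" markers were only floated with question marks in the post, and the sketch ignores the closing remark about interpreting loops anyway when they clearly have no runtime side-effects.

```python
from enum import Enum


class Mode(Enum):
    NOT = "not"
    TRY = "try"
    REQUIRE = "require"


# Each option maps a marker ("" means unmarked code) to a mode.
OPTIONS = {
    1: {"": Mode.NOT, "$": Mode.REQUIRE},
    2: {"": Mode.NOT, "$": Mode.TRY, "$$": Mode.REQUIRE},
    3: {"": Mode.TRY, "$": Mode.REQUIRE, "!$": Mode.NOT},
}


def decide(option, marker, precomputable):
    """Return "unroll" or "keep" for a loop; raise if a REQUIRE cannot be met."""
    if marker not in OPTIONS[option]:
        raise ValueError(f"marker {marker!r} is not defined under option ({option})")
    mode = OPTIONS[option][marker]
    if mode is Mode.NOT:
        return "keep"
    if precomputable:
        return "unroll"
    if mode is Mode.REQUIRE:
        raise ValueError("precomputation was required, but the loop depends on runtime values")
    return "keep"  # TRY quietly falls back to a runtime loop
```

Under option (1), `decide(1, "", True)` keeps an unmarked delay loop as written, while `decide(1, "$", False)` is a compile error rather than a silent fallback.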
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 07 Apr 2012 08:50:59 pm Post subject: |
|
|
THE TIME HAS COME! (the walrus said) ... I have reviewed all comments, topics, posts, etc. all the way back until before I switched the language to a "Go-ish" layout, and I finally feel that OPIA is fully (syntactically and semantically) defined and decided (or at least, as much as it can be before it is coded) -- which I wanted to do before I did anything huge with the compiler.
My plans (when time permits between school, work, moving, and my soon-to-be first child) are as follows:
(1) Lay out a careful plan for the compiler pipeline/design. This has already been 90% grasped in my head alone (after having pored over compiler books and articles and mental experiments "for fun"), but I have recently re-read most of the chapters in Modern Compiler Implementation In Java (it's the BEST!), and have been charting out a careful comparison of my design versus what everyone else swears by -- and it turns out that I am not deviating terribly much. Expect an analysis and layout of the plan from me in the future (I have it mostly set).
(2) Update the Language Overview as much as I see fit (I don't want to be too picky with it, but I do want to make sure that every aspect is well documented so nothing falls through the cracks).
(3) Jump into the coding. I will keep everyone updated as that progresses. The exciting thing is that I will release it in modules, so that you can see everything up to the tokenizing & preprocessing (this is its current state, though I may rework that a bit to be SLIGHTLY more modular), parsing/tree-building, etc. Every aspect of the pipeline will be explained, as in (1).
Anyone is welcome to ask questions about anything they think is lacking or confusing, and I am still open to considering some changes/additions; but I don't foresee anything that would change the language enough to put off coding it now.
Side note: as for interpreted aspects, I am going with "(1)" from the previous post: The default will be to precompute as much as possible without unraveling loops or recursive calls (unless the contents clearly have no runtime side-effects), and the $ operator will be "deep"/recursive, causing a thing (and ALL of its contents, except for references to externally defined entities) to be fully precomputed. |
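The "deep"/recursive behavior of $ could be sketched like this in Python. The dict-based tree and the field names ("dollar", "extern", "children") are assumptions made up for this example and say nothing about how the real compiler will store things.

```python
def mark_required(node, inherited=False):
    """Return the names of all nodes that must be fully precomputed.

    node is a dict such as:
        {"name": "f", "dollar": True, "extern": False, "children": [...]}
    Missing keys default to False / no children.
    """
    if node.get("extern", False):
        # References to externally defined entities are exempt from a deep "$".
        return []
    required = inherited or node.get("dollar", False)
    names = [node["name"]] if required else []
    for child in node.get("children", []):
        names += mark_required(child, required)
    return names


# Usage: "$" on f covers everything inside it except the external reference.
tree = {
    "name": "f",
    "dollar": True,
    "children": [
        {"name": "loop", "children": [
            {"name": "x"},
            {"name": "cursor", "extern": True},
        ]},
        {"name": "g"},
    ],
}
required_names = mark_required(tree)
```

Anything not collected here would fall under the default rule: precompute what is knowable, but leave loops and recursive calls alone unless they clearly have no runtime side-effects.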
|
|
|
elfprince13

OVER NINE THOUSAND!

Joined: 23 May 2005 Posts: 10232 Location: A galaxy far far away......
|
|
|
|
KermMartian

Site Admin

Joined: 14 Mar 2005 Posts: 55751 Location: Earth, Sol, Milky Way
|
Posted: 09 Apr 2012 01:07:42 am Post subject: |
|
|
| shkaboinka wrote: | | THE TIME HAS COME! (the walrus said) ... I have reviewed all comments, topics, posts, etc. all the way back until before I switched the language to a "Go-ish" layout, and I finally feel that OPIA is fully (syntactically and semantically) defined and decided (or at least, as much as it can be before it is coded) -- which I wanted to do before I did anything huge with the compiler. | I'm thrilled! Can't wait to see this in action.
| Quote: | | My plans (when time permits between school, work, moving, and my soon to be first-child) are as follows: | Don't forget technical proofing, Mr. TE!
| Quote: | | Side note: as for interpreted aspects, I am going with "(1)" from the previous post: The default will be to precompute as much as possible without unraveling loops or recursive calls (unless the contents clearly have no runtime side-effects), and that the $ operator will be "deep"/recursive, causing a thing (and ALL of its contents, except for references to externally defined entities) to be fully precomputed. | Sounds good, thank you for clarifying that.
 |
|
|
|
shkaboinka

Power User

Joined: 30 Jun 2010 Posts: 371 Location: Spokane, WA
|
Posted: 09 Apr 2012 08:20:54 am Post subject: |
|
|
| KermMartian wrote: | Don't forget technical proofing, Mr. TE!  |
No worries there. My priorities are Family, School, Work, Technical Proofing, and then OPIA (... and then Minecraft). |
|
|
|
|
|
© Copyright 2000-2013 Cemetech & Kerm Martian
|