That's fair; I'm glad you at least put some consideration and justification into your design choice. :) Are you using the green-black short trick to make the PC PSU function?
HOLY CRAP!! THE POWER JUST WENT OUT AGAIN! Thank God for batteries! Too bad they don't make battery-powered soldering irons :). Anyway, the answer is yes. That switch in the back was put there by my father many years ago so he could use the supply to test old motherboards. I had to twist some arms to get him to give it to me! (One condition is that I can't damage it. That's why I didn't cut wires or drill holes in it!) I hate the power going down all the time. But the internet stayed up regardless. I guess the cable company still has power!
On the power supply, is it the orange one that is 5V, or the red one? I read on eHow that it was the red one. Should I measure it, or is it a standard?
Use one of the four-pin Molex connectors, not the motherboard connector. Red is +5V, yellow is +12V, and the black wire closest to each is the respective ground for each.
OK Kerm. You know, my local RadioShack didn't carry them. They had 4-pin connectors, but they didn't fit. I'll see what I can do about that (probably just get the two-pin, slide it over the Molex pin thingy, and solder to the pin: Molex with no plastic connector thing :D).
On a side note: Do they make a TOOL to cut and strip wires to the right length? If they don't, I wanna be the first to invent one! You don't realize how big of a hassle it is to strip a wire until you have to strip 1240 of them... But that's what I have to do. Just wondering.

UltDev: I need to have your name so I can include you in my acknowledgements. (Kerm, I already have your name. (Christopher "Kerm Martian" Mitchell))

Here is a picture of the bottom of the second board, with all the power circuitry wire soldered in.

Here is a picture of the top of the second board, with all the power circuitry wire soldered in.

Here is a picture of the bottom of the first board, with all the power circuitry wire soldered in.

This is the top of the first board, with all the power wires soldered in.

This is a picture of my back-up light source. (Cuz the power has been out over half the time for the last 2 days!)

Here I am soldering utilizing that back-up light source. I was cutting wires to size, because that's all I could do with no power.

This is my hand, placing a tiny wire into its right place.

This is me cutting and stripping the wire to size.

And finally, the one thing I really want is for the power to stay on. Without electricity, there is no heater!! AND IT'S FREEZING!! (-8 degrees Fahrenheit overnight! Below freezing indoors.)

Feb 7-
It's warm again! So, over the last few days, I have been getting lots of stuff done. I put the ICs in their sockets today. (I poked my thumb so hard, it bled :( ) I met a new problem when doing this: the ICs are too close lengthwise. (Kerm, is THAT what you meant? :D) So I solved this by cutting half a thousand wires, and soldering these tiny wires to the IC to extend its pins into the socket. That was pretty fun! Anyway, lots of work got done, and I am really close to it being done. (Sorry, I'm on the computer w/ no SDHC, so no pics yet.) Also, could someone check this math.

Plz check my math here.
My design has 112 command gates.
The Intel i7 has 781,000,000 transistors.
Divide 781,000,000 by 2 to get the gate count. (most gates have 2 transistors. Exception is NOT and exclusive gates, which pretty much cancel each other out. (1 vs. 4...))
The i7 has approximately 390,500,000 gates.
The i7 is 32 bit, so take my 8 bit and multiply by 4 to get 448 gates for 32 bit.
It is a known fact that my CPU is 5 MIPS (40 MHz clock, but 8 clock cycles per common command; the exception is math commands, which take 1).
The i7 has 147,600 MIPS.
So, I took the i7 gate count, divided by my gate count and got 871651 as a scale factor
Finally, the answer I get with that factor sounds so skewed, I don't know what went wrong. (I mean, the i7 uses 871,651 times as many gates, but there is NO way my CPU could be 871651 times faster if I had the same 45 nm technology to build my design 871651 times! That number is like 50 times faster than the fastest CPU on the market!!!)
adept wrote:
Finally, the answer I get with that factor sounds so skewed, I don't know what went wrong. (I mean, the i7 uses 871,651 times as many gates, but there is NO way my CPU could be 871651 times faster if I had the same 45 nm technology to build my design 871651 times! That number is like 50 times faster than the fastest CPU on the market!!!)
Yes, your math is way off. My estimate follows.

Problems
adept wrote:
Divide 781,000,000 by 2 to get the gate count. (most gates have 2 transistors. Exception is NOT and exclusive gates, which pretty much cancel each other out. (1 vs. 4...))
The i7 has approximately 390,500,000 gates.
In most modern commercial ICs, gates are implemented in CMOS logic. An inverter requires two transistors (one NMOS and one PMOS), a two-input NAND gate requires four (two of each), and so on, so your gate count is way off already.

adept wrote:
The i7 is 32 bit, so take my 8 bit and multiply by 4 to get 448 gates for 32 bit.
Again a faulty assumption: gate count doesn't scale quite linearly with word width, never mind that the i7, as an x86-64 core, uses a 64-bit word and in some cases even larger (i.e. SSE operations working on 128-bit registers).
The i7's arithmetic logic in particular is hugely more complex than yours, as it has an FPU, hardware dividers and multipliers, surely uses carry-lookahead adders (rather than the ripple-carry I assume you're using), etc.

adept wrote:
The i7 has 147,600 MIPS.
Not sure where you got that number, but it's definitely not millions of instructions per second. It's likely Dhrystone or Whetstone MIPS, which are an entirely different beast.
A more accurate estimate: 5 instructions per cycle * 2.8 GHz (the clock of my i7-860) = ~14 billion instructions per second per core, and * 4 cores = ~56 billion instructions per second. That number will never be attained in practice, due mainly to memory latency, but it's OK for these purposes.

A better estimate
As you had noted/guessed in chat, much of the gate count for modern x86 processors comes from caches. In the case of my i7-860 (a Lynnfield core with about 770 million transistors), there's 4x64 KB of L1, 4x256 KB of L2, and 8 MB of L3 cache, for a total of 9.25 MB of cache. IIRC, each bit of cache on a Lynnfield i7 requires six transistors, so 6*8*9.25 MiB = ~465 million transistors, leaving us with 'only' 300 million transistors or so working as other logic.

Of course, there's a lot more than just cores and cache on the i7 die, notably the memory and PCI-e controllers. I don't have any sort of hard numbers for these, so we'll have a look at a die shot and estimate from there:

Each large block of L3 in that image seems to be 2 MB, so around 100 million transistors in each of those areas. Those parts of the die are probably the densest anywhere, so let's say a similar area in the rest of the processor holds around 50 million transistors. If we tile a space about the size of one of those L3 blocks across the whole die, at an average density of 50 million transistors per block-sized area I estimate the total count to be around 850 million, so it's not too far off.

Each core is about 1.75 times the area of one of those 2 MB L3 blocks, so that yields 70*4 = 280 million transistors. Subtract the cache from each of those areas (each core has its own L1 and L2 caches for a total of 320 KB = ~15 million transistors) and we get 220 million. I'm still convinced that value is much too high, as cores should have proportionally fewer transistors per unit area due to more intricate interconnects between subparts, so we'll say about 150 million transistors in the core logic.

Conclusion
adept wrote:
So, I took the i7 gate count, divided by my gate count and got 871651 as a scale factor
We've established how off this is, so recalculating:
Assume an average of 4 transistors per gate in both machines. Mostly a wild guess. This gives us about 37 million gates in Lynnfield's core logic against your 112 (mind the huge complexity difference).

If we recall the ~56 billion instructions per second Lynnfield is capable of at 2.8 GHz, the ratios are suddenly much less favorable to you:
Lynnfield at 2.8 GHz is 56 billion / 5 million = ~11 thousand times faster than you, per-die.

Other notes
adept wrote:
there is NO way my CPU could be 871651 times faster if I had the same 45 nm technology to build my design 871651 times!
Quite correct. Aside from the bad estimates I've pointed out here, you'd need a lot more logic to coordinate those 870k cores to do anything useful as a unit: some sort of obscenely huge bus arbiter at the least, which would likely eat at least half your die space. Amdahl's Law throws another wrench in anything with that many cores, and you'd have extreme difficulty getting a task to scale across that many execution units.


Ye gads, that was a long post. I did rather enjoy writing it, though.
Gosh! That was WAY! more thought than I put into it. Anyway, I doubt that their CPU could even APPROACH 56 billion instructions per second. Especially when the measurement I saw on the internet was WAY smaller. Still, your die picture illustrated very well how far off my transistor count was. I'll use the thought you put into this one for other CPUs for comparison. (For example, I would like to compare to the Intel 4004, AMD Phenom II X6, and other CPUs to get the best results.)
Quote:
carry-lookahead adders (rather than the ripple-carry I assume you're using)

Dang! How did you know I just Googled "adder" and used the first thing that came up? You know, I have never heard of these "look-ahead adders". Judging by the name, they predict things somehow. It's all cold-fusion to me!
Quote:
A better estimate
As you had noted/guessed in chat, much of the gate count for modern x86 processors comes from caches. In the case of my i7-860 (a Lynnfield core with about 770 million transistors), there's 4x64 KB of L1, 4x256 KB of L2, and 8 MB of L3 cache, for a total of 9.25 MB of cache. IIRC, each bit of cache on a Lynnfield i7 requires six transistors, so 6*8*9.25 MiB = ~465 million transistors, leaving us with 'only' 300 million transistors or so working as other logic.

How do you get a 6 transistor gate?! I wanna know how! (Mine takes 8 for the record.)
So this whole Amdahl's law thing is really cool. He came up with the whole concept of diminishing returns. I didn't improve everything, so the improvement is smaller than it would have been in a simpler system. (I.e. even if you multiply my system by a trillion, the return from adding more of my CPU gets smaller and smaller, eventually becoming insignificant.) So that guy is saying that unless my system is designed big, no amount of scaling can make it the same as or better than something designed big. (I know that probably didn't make sense :D)
On a side note, how in the world did you know all that stuff? (I should remind you I had NO idea how computers worked before I started this project back in December, and never took any classes or anything.) You really impressed me. Also, I am disappointed my design failed, even here in the theoretical stage. It is possible for me to clock this at a higher rate with different ICs. But that still probably couldn't make up for the i7's superior design team. There is more in your post I'm curious about, but the bulk of it has been said.
So Tari, thanks a whole lot for the help. You solved my math woes. Thanks a million times!
adept wrote:
Gosh! That was WAY! more thought than I put into it. Anyway, I doubt that their CPU could even APPROACH 56 billion instructions per second. Especially when the measurement I saw on the internet was WAY smaller
It seems pretty accurate to me, having just tested it on hardware. I compiled the following program with gcc and ran it on my i7 with the help of Cygwin:

Code:
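#include <time.h>
#include <stdio.h>

int main(void) {
    clock_t start = clock();

    /* Count %eax down from 2^32-1 until it equals %ecx (zero).
       A single asm statement with a clobber list tells the compiler
       which registers and flags get modified. */
    __asm__ volatile(
        " movl $0xFFFFFFFF, %%eax\n"
        " movl $0, %%ecx\n"
        "1:\n"
        " decl %%eax\n"
        " cmpl %%eax, %%ecx\n"
        " jne 1b\n"
        : : : "eax", "ecx", "cc");

    clock_t stop = clock();
    /* clock_t is not necessarily int, so cast before printing. */
    printf("CLOCKS_PER_SEC=%ld\nCompleted in %ld clocks.\n",
           (long)CLOCKS_PER_SEC, (long)(stop - start));
    return 0;
}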

Resulting in..
Tari's terminal wrote:
Tari@Kerwin $ gcc -O0 -g -o counter counter.c && ./counter.exe
CLOCKS_PER_SEC=1000
Completed in 2620 clocks.

Meaning my processor counted from 2^32-1 (roughly 4.3 billion) down to zero in just over 2.6 seconds. The loop body is three instructions, so that single core hit nearly 5 billion instructions per second in a loop which didn't get anywhere near full processor utilization (that loop probably incurred large delays from the jump and lack of other instructions to interleave while waiting on fetch). I'd guess the core was waiting on fetch about half the time, and that was only one of four cores running, so that simple estimate puts it at 40 billion instructions per second.

What I'm saying is, my number was accurate, and that's truly astounding when you get to thinking about it. :)

adept wrote:
The Tari wrote:
A better estimate
As you had noted/guessed in chat, much of the gate count for modern x86 processors comes from caches. In the case of my i7-860 (a Lynnfield core with about 770 million transistors), there's 4x64 KB of L1, 4x256 KB of L2, and 8 MB of L3 cache, for a total of 9.25 MB of cache. IIRC, each bit of cache on a Lynnfield i7 requires six transistors, so 6*8*9.25 MiB = ~465 million transistors, leaving us with 'only' 300 million transistors or so working as other logic.
How do you get a 6 transistor gate?! I wanna know how! (Mine takes 8 for the record.)
It's a cell of SRAM, not so much a gate. Wikipedia knows more.

adept wrote:
So this whole Amdahl's law thing is really cool. He came up with the whole concept of diminishing returns.
I might not say quite that much, but Amdahl's law definitely codifies the difficulty with multicore programming. The concept of diminishing returns certainly predates Amdahl, anyway.

adept wrote:
On a side note, how in the world did you know all that stuff? (I should remind you I had NO idea how computers worked before I started this project back in December, and never took any classes or anything.) You really impressed me.
Quoth the 'about me' bit on my web site:
taricorp.net/about wrote:
As of the fall of 2009, I am a student at Michigan Technological University, studying for a BS in Computer Engineering.
It's the result of years of study on my own and some good formal education (mostly just devouring information when I find it, really).

It's worth noting that I wrote a research paper on the topics of microprocessing towards the end of my high school career, which you can read over on scribd (note that there are probably a few parts which are inaccurate or have minor grammatical errors/typos).

[aside: this isn't the first time someone's been intrigued by my seeming in-depth knowledge of low-level hardware. The previous time was when I got to talking about die yields from a wafer on IRC somewhere and that prompted someone else who reportedly had worked with that sort of hardware (for TI, no less) to wonder how I knew what I do. Answer: just a very interested lay-person.]

adept wrote:
Also, I am disappointed my design failed, even here in the theoretical stage.
Hey, I wouldn't say that. As long as the thing runs when you get it built, I count that as a win. Even if you don't, it looks like this was a great learning project for you.
Saying it's a failure simply because it's a rather inefficient little processor is a bit like saying you're a failure at swimming because you couldn't beat a world record.

adept wrote:
So Tari, thanks a whole lot for the help. You solved my math woes. Thanks a million times!
Glad to be of service, and I actually had some fun writing out these huge posts tonight. :D
*nudge* A couple errors I noticed in my first post from yesterday.

The Tari wrote:
If we recall the ~56 billion instructions per second Lynnfield is capable of at 2.8 GHz, the ratios are suddenly much less favorable to you:
Lynnfield at 2.8 GHz is 56 billion / 5 = ~10 billion times faster than you, per-die.
Looks like I got a bit excited there. Your processor gets 5 MIPS, or 5 million instructions per second, so the scale factor is 56 billion / 5 million = ~ 11 thousand. I've updated my post to reflect this figure.

The Tari wrote:
a two-input AND gate requires four transistors
That's for a NAND gate, not AND, which would require six. Fixed in the post as well.
Quote:
Feb 7-
It's warm again! So, over the last few days, I have been getting lots of stuff done. I put the ICs in their sockets today. (I poked my thumb so hard, it bled :( ) I met a new problem when doing this: the ICs are too close lengthwise. (Kerm, is THAT what you meant?
I meant that you should have space between the sockets to run inter-column wires. You mean your sockets and your ICs have different numbers of pins?
Quote:
So I solved this by cutting half a thousand wires, and soldering these tiny wires to the IC to extend its pins into the socket. That was pretty fun!
And also a bad idea, because the heat of soldering can destroy ICs; that's one of the points of using sockets, besides replaceability. Luckily, 74-series ICs are hardier than some.
Quote:
Anyway, lots of work got done, and I am really close to it being done. (Sorry, I'm on the computer w/ no SDHC, so no pics yet.) Also, could someone check this math.
Extra props for hacking by candlelight.
Quote:
I meant that you should have space between the sockets to run inter-column wires. You mean your sockets and your ICs have different numbers of pins?

To clarify that whole problem, I want you to use your imagination. (No time for pictures.)
Imagine all the sockets are touching (on the sides with no pins). Now imagine you have an IC. So, you go to plug the ICs into those sockets, and lo and behold, each IC extends OVER the sides with no pins. So now you're screwed. So you ask yourself: what in the world am I going to do? You sit and think. You don't have time, solder, or wick to desolder every single one of those suckers and put them on a different board, which you also don't have. You could put components on the foil side and solder on the plastic side. Obvious problems there. You could also cast a magic secret voodoo spell, but you don't have time to learn the incantations. But wait, you still have one more option. You could always solder itty-bitty wire bits, and go 3-D in the up direction. Extend those pins. It has its disadvantages, mainly cutting half a thousand wires, soldering half a thousand wires, heating the ICs, and more. It just was the easiest option without resorting to extremely time-consuming methods. And yes, thank goodness 74-series chips are tough!
@ Tari
OK, you clarified my questions about your post. It figures as much. The only hope I have now of justifying why I built this is to either claim ignorance of what you've said and go with my math, or modify my purpose to something business oriented, and hope the judges are businessmen, not scientists. (There are better options. I'm just kidding.) I'll be thinking about what I can tell the judges to get them to think my project is the winner.

Also, for my display board, which I have started, I took these LEDs I had, and lit up the title (Bit by Bit). It looks really cool. (Just a side note, not really a project.)

Just wanted to add something for Tari:
I read your paper. It's really exciting to see how much knowledge we have in common. It is obvious that you know a lot more about x86 than I do. I have 14 commands; an x86 has something over 200 commands. I really think that's excessive, but whatever floats your boat. I have just one question about your paper: Did you ever actually build one?

Double Edit!!:
Amdahl's law is like a machine shop with a tire. (A CPU with a task.) Even if you hire 10 workers to work on that tire, (10 CPU's on one task.) the task will still take the same amount of time.
I hope you did something like only raised up every other IC...? I should note that I think many of our older members have a lot of x86 knowledge from years of writing low-level C, x86 ASM (my first language other than LOGO, TI-BASIC, QBASIC, and HTML!), and taking all kinds of computer architecture classes. :) I'm a little confused by your x86 comments at Tari.
Yes, I'll post a picture soon. (I'm too lazy to get up and do it now.) And I was talking about the document Tari referred me to. His CPU design focused on a CPU that was x86 compatible, which mine is not. I knew I would never have the time, experience, or knowledge, or enough time to acquire the experience and knowledge, so I didn't worry about x86 compatibility. That's what I meant.
Now, the reason for this post is to ask if there is ANYTHING you guys want to tell me before tomorrow, which is the due date for the paperwork. After tomorrow, there IS no turning back, and everything will be finalized. I'll check this forum again tomorrow morning (hopefully) for any comments and suggestions on what I can do before this is finalized.
adept wrote:
And I was talking about the document Tari referred me to. His CPU design focused on a CPU that was x86 compatible, which mine is not. I knew I would never have the time, experience, or knowledge, or enough time to acquire the experience and knowledge, so I didn't worry about x86 compatibility. That's what I meant.
My 'SMP' design is far from x86-compatible. There's the more immediately obvious difference that it's only an 8-bit core (with 24-bit instruction words, no less), but it's an entirely original ISA with a Harvard memory model. Quick comparison of the register sets:

x86: four general-purpose registers (32-bit), a handful of segment registers (6 at 16 bits each), and a couple index registers, as well as an FPU with eight 80-bit registers and another eight SSE registers at 128 bits wide each.
SMP: four GPRs (A-D/r0-3) plus accumulator, r/w sink (similar to MIPS' $zero), some memory access registers (IAR and PMB), and a few GPIO buffers, all of which are 8 bits wide.

As far as general capabilities go, my SMP is much closer to a PIC14 core than x86, although with some feature ideas taken from ARM.
Adept, ah, gotcha. Tari, impressive design, thanks for sharing!
I've managed to lose my camera, and can't put the photos up. (It'll probably turn up eventually.) Anyway, over the last week or two, I've been really super busy because of band large group, but still made progress. On the interface, I made some ribbon cables to connect it to the CPU board. The input (which is just push-buttons), has all the input pins on the buttons tied to V++, 12 of the outputs connected to the collector of 12 transistors, the 13th button connected to the base pins on all of those, resulting in 12 buttons additionally switched by a single "Go" button. (Whew!)
On the main board, I've been checking dozens upon hundreds of thousands of connections. They ALL have to work perfectly the first time. Then, on top of that, I have been thinking about how to mount it securely. The wires form this large black weave over every chip (screws me if one gets damaged :( ) and obstruct the mounting holes. Also, they tower about 1 1/2 inches above the board. I can't find a darn standoff that big with a screw small enough to fit in the hole.
On the display board, I still have to solder the 113 LED lights. If even ONE of them doesn't light properly, it will reflect extremely poorly on my ability as an electrical engineer. So that has to be perfect. (Also, hopefully it will attract people to my project, like moths to a lightbulb.) Then I (still) have to glue on the results revised from Tari's comments.
Then, because I win by default for the regional (no other high-schooler entered in my category...), I want to revise my schematics to make them look better. This includes some color-coding, writing the 3rd page on the computer, not by hand, and stuff like that. More revisions will be made for the International competition (should I win state). Also, I think it would be neat to take it to the Maker Faire. There is plenty of documentation from the very beginning to the end of the project. People could duplicate my results easily, esp. if I finish the PCBs I started to make back in January.
Oh, and I just wanted to point out this site I found last night (http://www.homebrewcpu.com). It's almost the same thing I did, but executed better with more professional style.

P.S. Since I now understand what makes computers work, can program in my own op-codes, and stuff, I'm going to start programming in Z80 assembly for real this time (not just the crappy "hello world" stuff that I could do before). For my UltCalc project (on the table next to my CPU BTW, waiting only to be reassembled), I'd like to make something like Doors CS, just stripped of extra features to save space. Also, make a mouse driver for the touchpad and make some games and stuff.
For a mounting, I'd consider a custom-built thing. With some lumber and the assistance of someone used to woodworking, you could probably build a small box with open top and bottom, as below:

Just make the lip such that you can rest the edges of the board on it, and build some piece to go over the top of it all and hold the board in (another square frame to fit inside but not cover the board, or I like the idea of a piece of acrylic to form a complete top of the box). With an easily-removed top piece, it's also very easy to remove the board for whatever reason, if needed.
@Tari:
That's not a bad idea at all... The only problem is that acrylic is expensive and not available at places like Hobby Lobby (i.e. local stores). It might be possible with wood trim... I'll look more into that idea.
BTW: how the heck did you find time to Google sketch up your idea? :P (PS Was that Sketch-Up?)
I don't think you need a lot of acrylic. I picked up a small sheet for a project for less than $2 at the hardware store, and I doubt you would need more than a few of that size. No idea how easy such a thing would actually be to find, though.

adept wrote:
BTW: how the heck did you find time to Google sketch up your idea? :P (PS Was that Sketch-Up?)
It was the work of a couple minutes in NX, as I had the program handy and it's the one I know best. As for why, I figured it would be much easier to illustrate what I mean with a simple 3d model rather than try to describe the whole thing.
The Tari wrote:

adept wrote:
Also, I am disappointed my design failed, even here in the theoretical stage.
Hey, I wouldn't say that. As long as the thing runs when you get it built, I count that as a win. Even if you don't, it looks like this was a great learning project for you.
Saying it's a failure simply because it's a rather inefficient little processor is a bit like saying you're a failure at swimming because you couldn't beat a world record.


Well, Intel has huge computer systems optimizing their chip designs in order to squeeze every drop of performance and space out of them. I'd be surprised if anyone without a lot of time to think or a graduate degree in EE could beat them. Even then...

As for the diminishing returns thing, it's technically true, but there's still something to be said for massively increasing the number of cores. Watson is good evidence of that. Better evidence, I think, is that of the organic brain. Most humans can outperform any supercomputer currently in existence, partially as a result of the sheer number of neurons in the brain. Of course, more of that power is the result of fancy signal processing by the brain, but who am I to argue with Binary?
Regarding the Doors CS comment about cloning it with many fewer features, why not do something original and write some CALCnet games, or other programs and games, rather than reinventing the wheel? :)
  