I decided to go through my logs today and create some general stats on the things we've said over the past year (well, almost: 11 months). These are just my own personal logs, so there were a few downtimes, but nothing longer than a week or two, I believe. I still think that they are pretty accurate, though.

I wrote a Perl script (which I can share if you want, but it's very tailored to my own log format) to do all of these, and I'll go over what each one means, in case it's not clear.

Also, I just remembered about the Laughing Out Loud filter, so anywhere you see a 0x5, mentally replace it with l o l, since that's what actually shows up in the logs. (Except for one place, and I've made a note there)


Code:
Average response time: 53.1253440333749s, 0.885422400556248m, 0.0147570400092708h, 0.00061487666705295d
Average messages per day: 1624.32941176471/day

We average about 53 seconds in between each person saying something, and get a total of ~1624 lines per day.
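
For anyone curious how figures like these can be computed, here is a minimal sketch. It is not the script from this post: the timestamps are made up, they are assumed to be sorted epoch seconds, and "day" is assumed to mean a UTC calendar day.

```perl
use strict;
use warnings;

# Hypothetical message timestamps (epoch seconds), already sorted.
my @times = (0, 30, 90, 150, 86_400, 86_460);

# Average gap between consecutive messages.
my $gap_total = 0;
$gap_total += $times[$_] - $times[$_ - 1] for 1 .. $#times;
my $avg_gap = $gap_total / $#times;

# Average messages per calendar day (86,400 seconds per day).
my %per_day;
$per_day{ int($_ / 86_400) }++ for @times;
my $avg_per_day = scalar(@times) / scalar(keys %per_day);

printf "Average response time: %.2fs, %.4fm\n", $avg_gap, $avg_gap / 60;
printf "Average messages per day: %.1f/day\n", $avg_per_day;
```

Only days that contain at least one message are counted here, so logging downtime would not drag the per-day average down. Whether the real script handles downtime that way is not stated in the post.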


Code:
Most used messages:
[2267] >> lol
[1375] >> :p
[1346] >> haha
[1209] >> ah
[1202] >> ok
[1198] >> heh

# OLD
Most used messages:
[2261] >> lol
[1374] >> :p
[1345] >> haha
[1209] >> ah
[1202] >> ok
[1198] >> heh

These are the lines that people have said on their own. In other words (going by the old numbers), there have been 2261 times that someone has said "lol" on its own line, 1374 times for ":p", and so on.


Code:
The users that spoke the most are:
[70542] >> kermm
[25835] >> ashbad
[21495] >> catherine
[21427] >> aes_sedia5
[21194] >> merth
[18725] >> forty-two

# OLD
The users that spoke the most are:
[54004] >> kermm
[25421] >> ashbad
[21117] >> catherine
[18215] >> aes_
[14753] >> forty-two
[14231] >> qazz42

Pretty self-explanatory. Going by the old numbers, Kerm's said 54004 lines, Ashbad 25421, me 21117, etc.


Code:
The most highlighted users are:
[7238] >> kermm
[3567] >> http
[2667] >> merth
[2560] >> well
[2349] >> catherine
[2303] >> benryves

# OLD
The most highlighted users are:
[4493] >> kermm
[3567] >> http
[2556] >> well
[2303] >> benryves
[2287] >> catherine
[2144] >> oh

This one is meant to count the number of times people have highlighted one another. A highlight counts as anything matching this (straight from my code):

Code:
$msg =~ /^([^ ]+)[:,]/

Which means that any time someone says "KermM, [. . .]" or "Benryves: [. . .]" it will match. It also means that any time someone does "http://cemetech.net" it gets matched (I could take that out, but it's kinda interesting), and the same goes for lines that start with "well," or "oh,". The number in brackets is the number of times it's been said, like the others.
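
To make the behaviour of that pattern concrete, here is a small sketch that runs the same regex over a few made-up messages. Only the regex comes from the post; the sample lines and the %highlights hash name are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample messages, already lowercased like the real input.
my @messages = (
    'kermm, is the server up?',
    'benryves: nice screenshot',
    'http://cemetech.net',
    'well, that was unexpected',
    'no highlight in this one',
);

my %highlights;
for my $msg (@messages) {
    # Same pattern as the post: leading non-space characters followed by ':' or ','.
    $highlights{$1}++ if $msg =~ /^([^ ]+)[:,]/;
}

print "[$highlights{$_}] >> $_\n" for sort keys %highlights;
```

The URL is captured as "http" because the first word backtracks to the last ':' or ',' it contains, and "well," counts as a highlight of a user called "well". The last sample line does not match, since its first word is followed directly by a space.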


Code:
The SAX users that spoke the most are:
[24929] >> ashbad
[2998] >> sarah
[2947] >> zeldaking
[2374] >> aetios
[1921] >> lincolnb
[1716] >> willrandship

# OLD
The SAX users that spoke the most are:
[24924] >> ashbad
[2998] >> sarah
[2947] >> zeldaking
[2374] >> aetios
[1921] >> lincolnb
[1716] >> willrandship

I felt the SAX users were under-appreciated, so I did a check for them, too. These are the number of lines each user from SAX has said.


Code:
The most-used SAX phrases are:
[487] >> hi
[369] >> 0x5  (This one is actually 0x5)
[363] >> :p
[211] >> ok
[195] >> :)
[167] >> oh

These are also from SAX. They're similar to the first set, at the top of this post.

Edit:
I checked for runs, smileys, and the most-used words overall. I also added Merth's DecBot links to the program, so now all of the variants of "Kerm" contribute to his count. I left the old numbers in there so that everything still makes sense, but the ones above them are more accurate.

Code:
The short-runners are:
[205] >> kermm
[55] >> geekboy
[52] >> ashbad
[42] >> nikky
[40] >> forty-two
[35] >> comicidiot

The medium-runners are:
[266] >> kermm
[224] >> ashbad
[175] >> doorscs
[101] >> comicidiot
[86] >> ahelper
[84] >> forty-two

The long-runners are:
[72] >> doorscs
[25] >> ashbad
[20] >> ahelper
[16] >> forty-two
[16] >> kermm
[15] >> dtal

A "run" is defined as being when the same user says something 4 times in a row without being interrupted during a period of time. For short runs, it's 1 minute; medium - 3 minutes; long - 5 minutes. And, runs aren't counted twice, so if you say 5 things, it still counts as 1 run.
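
Here is a rough sketch of how a run counter along those lines could look. It is an interpretation rather than the actual script: it assumes the time window is measured from the first line of a streak, and the count_runs name and the sample log are invented for the example.

```perl
use strict;
use warnings;

# Count runs per user: $min_lines consecutive lines from one user, all within
# $window seconds of the first line of the streak. A streak is counted once.
sub count_runs {
    my ($window, $min_lines, @log) = @_;
    my %runs;
    my ($user, $start, $count) = ('', 0, 0);
    for my $entry (@log) {
        my ($time, $who) = @$entry;
        if ($who eq $user && $time - $start <= $window) {
            $count++;
            $runs{$who}++ if $count == $min_lines;   # count each streak once
        }
        else {
            ($user, $start, $count) = ($who, $time, 1);   # start a new streak
        }
    }
    return %runs;
}

# Hypothetical log entries: [epoch seconds, nick].
my @log = (
    [0, 'kermm'], [10, 'kermm'], [20, 'kermm'], [30, 'kermm'], [40, 'kermm'],
    [50, 'geekboy'],
    [60, 'kermm'], [70, 'kermm'], [80, 'kermm'],
);

my %short = count_runs(60, 4, @log);   # "short" runs: 1-minute window
print "[$short{$_}] >> $_\n" for sort keys %short;
```

In the sample, the first five lines from kermm count as a single run, and the three lines after geekboy's interruption are too few to count as another.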


Code:
The average line-lengths for short runs are:
[7.98819301848049] >> kermm
[6.576] >> geekboy
[10.5921926910299] >> ashbad
[6.14189189189189] >> nikky
[5.41071428571429] >> forty-two
[6.18262411347518] >> comicidiot

Kerm wanted me to see how long the lines were for people who did runs (Since he did them more than Geekboy). This is the number of characters on average.


Code:
The highest line-lengths for runs are:
[108.5] >> jared__
[33.5] >> brandonw|
[27.8888888888889] >> iambian
[25.25] >> alberth
[25.125] >> valberth
[24.5] >> jet322

Since I already had the run data for line lengths, I figured it'd be interesting to see what the longest line lengths were for it. This one is also the number of characters on average.


Code:
The most-used smileys are:
[11395] >>  :p
[11203] >>  :)
[6468] >>  :d
[4870] >>  :(
[2616] >>  :/
[2516] >>  ;)

This one just counts the number of words that are actually smileys. I have some terrible regular expression that matches as many smileys as I could think of, including backwards (To get all those fiends that do smileys like "(:").


Code:
The smiley-users are:
[8204] >> kermm
[4163] >> catherine
[3769] >> benryves
[2987] >> ashbad
[2805] >> tifreak8x
[2097] >> alberthro

These are the people that used smileys the most. I just realized that it counts people multiple times if they say "something like this :D :D :D :) :/".


Code:
The most-used words overall are:
[95352] >> i
[88523] >> the
[68434] >> to
[63401] >> a
[47308] >> it
[42311] >> you

No surprises here. This just splits all the words we've ever used and counts their occurrences.
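
A minimal sketch of that kind of word count is below. It reuses the lowercasing and the s/[^a-z]//g punctuation strip that appear elsewhere in this thread; the sample lines, the whitespace split, and the tie-breaking sort are assumptions.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the channel log.
my @lines = ('I think the bot is down', 'The bot? I restarted it.', 'ok');

my %count;
for my $line (@lines) {
    (my $lower = $line) =~ y/A-Z/a-z/;       # lowercase everything first
    for my $word (split ' ', $lower) {
        $word =~ s/[^a-z]//g;                # strip punctuation (optional step)
        $count{$word}++ if length $word;
    }
}

# Sort by count (descending), then alphabetically, and print the top 3.
my @top = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
print "[$count{$_}] >> $_\n" for @top[0 .. 2];
```

Dropping the substitution line makes "bot?" and "bot" count as different words, which is the comparison described in the observations further down.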
EndEdit


Observations:
:: Ashbad says a freaking TON of things. In total, he's said 50345 lines. That's only 3659 lines fewer than Kerm has said from IRC, and 4916 lines fewer than Kerm's said in total. Ashbad accounts for 9% of the lines for #cemetech.

:: Originally, before I took out all the "USER has entered the room." or "USER has logged in" lines, Souvik ranked up at the top. "(C) *Souvik has entered the room." showed up 2717 times, 456 more times than "lol".

:: Kerm does a lot in this channel. That's to be expected, obviously, but it's neat to see the data prove it, too.

:: Thank GOD "xD" didn't show up anywhere near the top. I'd have to stab something if it ranked #1, or even #6. I'm not too happy about "lol" being up there, but I don't suppose there's much to do about that after The Rise and Fall of L O L (And, I'm guilty of using it, too).

:: I am kind of surprised about the smileys being ranked so highly: ":p" being #2 and #3, and ":)" being #5 in the lists. I expected them both to be around #6.

:: I also think it's interesting that the people who spoke the most tended to be intelligent. Or, at least, Kerm says high-quality things a lot; Ashbad also does, for the most part. I like to think that I do, but I know a lot of it can just be rambling. And then there's Aes, Seana, and Qazz. Aes has been missing for a couple weeks/months. Seana says random stuff all the time (<3). And Qazz... is Qazz. I find it weird that there's this divide in that sense.

:: I added a quick check to see how many unique messages there are in the channel: 372229. ~66%, or 2/3 of our channel is completely unique in that sense! The other 1/3..? I guess a lot of it has to do with "lol" and ":p", along with the ones further down the list (I only listed the top 6 for each category, but I assume that 7-15 are going to be high numbers, too).

Edit:
:: I like smileys. I find it kind of funny that I used a little over half as many smileys as Kerm used. Also, about 20% of the things I've said have at least one smiley in them.

:: Another interesting thing is that Kerm does more runs than Geekboy. Almost 4 times as many. He also only uses ~1.4 characters more than Geekboy does, on average.

:: Who is this "jared__" guy? Whoever he is, he must have been telling a story (perhaps to himself!)

:: I am very, very happy to see that "XD" or any variant isn't in the top 6 for smiley usage. I was pretty worried it would be #6, but we only seem to use colon-smileys.

:: Merth jumped up to #3 on the Most Highlighted Users list! He was right about being about half-and-half when it comes to which nickname he used ("Shaun" or "Merth").

:: The number of lines that have ":p" in them makes up 3% of our unique messages. 7% of our total messages have at least one smiley in them (from the top 6 list).

:: The most-used words overall are still the most-used when you include punctuation words. I originally had some regex in there that would get rid of the punctuation, but when I removed that line, the rankings stayed the same:

Code:
s/[^a-z]//g


:: I also got the regex wrong for smileys the first time, and found out that Aes used a smiley "<<<.>>>" one time when his nickname was "Aes_santa".
EndEdit

What does everyone else think about this? If you want me to add any other tests or post the source code, reply and I will. I find this sort of thing really fascinating, and I'm sure several of you do too. I also want to try to make these numbers as correct as possible, so if you think I might have messed up on my calculations, please tell me.
*Just realized he talks too much*

I have been gone for like 3 months on a hiatus; I really don't know why. And I am STILL ranked 4. =O CRAZY. I would say I would talk less, but I am just starting to come back. haha. So hey, maybe I will move up in the ranks!
It would be interesting to see the counts of people speaking with linked names. For example, I'm probably half and half between "shaun" and "merth". I can get you a dump of decbot's links if you want.
merthsoft wrote:
It would be interesting to see the counts of people speaking with linked names. For example, I'm probably half and half between "shaun" and "merth". I can get you a dump of decbot's links if you want.

I'm interested in seeing that, too. Would you mind sending me DecBot's links? I would prefer some plain-text format, if you can get that (And, I'll probably be able to decipher it however you send it, so it doesn't need to be in XML or some other format like that. Maybe Merth_->Merth->Shaun, as an example). I thought about that while I was working on this, but I forgot about it since coming up with all the links would be too much work, and I forgot that DecBot had his list.

Edit: While I'm thinking about it: I also want to check who does the longest runs, where a run is the same person saying 3 or more lines in a row, in 1, 3, and 5 minutes. Another interesting thing to try is to see who does the most increments, and what it is they increment. Not to mention seeing if they increment a certain thing several times (I know that I periodically do Nightwish++ or Pandora++ while I'm listening to them/it).
Yeah, I talk a lot. Didn't know it was THAT much, though it's probably at least 5000-10000 lines more if you take into account my aliases Ashgood and TwiSpark in IRC. Well, I do chat here all day erry'day it seems for the past year or so. As for "high-quality" discussion, I guess I do a lot of that, besides when I'm talking about ponies or random crap with you, Cathy. Seems the same on your part too, except s/ponies/other offtopic things/.
_player1537 wrote:
merthsoft wrote:
It would be interesting to see the counts of people speaking with linked names. For example, I'm probably half and half between "shaun" and "merth". I can get you a dump of decbot's links if you want.

I'm interested in seeing that, too. Would you mind sending me DecBot's links? I would prefer some plain-text format, if you can get that (And, I'll probably be able to decipher it however you send it, so it doesn't need to be in XML or some other format like that. Maybe Merth_->Merth->Shaun, as an example). I thought about that while I was working on this, but I forgot about it since coming up with all the links would be too much work, and I forgot that DecBot had his list.
I can get that to you probably tonight, but you'll probably need to remind me. I don't store chains in the table, just terminals, so instead of Merth_->Merth->Shaun, it's just Merth_->Shaun and Merth->Shaun. Also, I'm case insensitive, and store them all as lower case, you might should do the same (someone saying "Merth: " and "merth: " are really the same thing).
merthsoft wrote:
I can get that to you probably tonight, but you'll probably need to remind me. I don't store chains in the table, just terminals, so instead of Merth_->Merth->Shaun, it's just Merth_->Shaun and Merth->Shaun. Also, I'm case insensitive, and store them all as lower case, you might should do the same (someone saying "Merth: " and "merth: " are really the same thing).

Okay, I'll remind you later tonight. And, that's fine, that format is probably easier to get what I need from it. I do everything case-insensitively. At the top of my code I have:

Code:
$user =~ y!A-Z!a-z!;
$msg =~ y!A-Z!a-z!;

So that everything is converted into lowercase.
http://merthsoft.com/declinks.txt
This should give you what you need. It's tab delimited.
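
As a sketch of how links like these could be folded into the stats: since only terminals are stored, a single lookup per nick is enough and no chain-following is needed. The column order (alias, then target), the sample rows, and the line counts below are all assumptions; the real layout of declinks.txt may differ.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited link lines, assumed to be "alias<TAB>target".
my @link_lines = ("Merth_\tShaun", "Merth\tShaun", "Kerm\tKermM");

my %canonical;
for my $line (@link_lines) {
    my ($alias, $target) = split /\t/, lc $line;   # store everything lowercase
    $canonical{$alias} = $target;
}

# Map a nick to its linked name; unlinked nicks map to themselves.
sub resolve {
    my ($nick) = @_;
    $nick = lc $nick;
    return exists $canonical{$nick} ? $canonical{$nick} : $nick;
}

# Merge per-nick line counts (made-up numbers) under the linked names.
my %lines_by_nick = (merth => 10, shaun => 12, 'merth_' => 3, catherine => 7);
my %merged;
$merged{ resolve($_) } += $lines_by_nick{$_} for keys %lines_by_nick;

print "[$merged{$_}] >> $_\n" for sort keys %merged;
```

With the made-up counts, "merth", "merth_", and "shaun" all land under "shaun", while "catherine" has no link and keeps her own entry.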
The statistics seem to be right just from sitting in IRC. It would be fun if saxjax watched the IRC channel and recorded stats, showing up on your user profile and on a public page. That would get interesting :)

merthsoft wrote:
http://merthsoft.com/declinks.txt
This should give you what you need. It's tab delimited.

:evil: I am linked 7 times, merth is linked 15 times, KermM is linked 20 times, etc.
These are fascinating statistics, Catherine; thanks for collecting them! Can you also calculate things like most active times of day and days of the week?
Thanks! :D Sure, I can try that. I was thinking about making a graph of the peak times of the days, and what days I was missing/my client was down, as well as what days were the most or least active. I'm going to update the first post when I get home (or while I'm at work if I feel like starting up X/Firefox) with all the things that changed since using Merth's links. One of the larger changes is that, yes, Merth did talk about half-and-half as Merth and as Shaun. This bumped him up to #3 in the Most Highlighted Users section. I'll run a quick test later to see exactly how split he was, but I think it was about even.

I also went ahead and did some checks to see who did the most "runs", and Kerm is at the top for the "short" runs, which were 1 minute long. He had 205 instances of it, while Geekboy (#2) had 55.
Heh, but I bet that the average line length in those runs was different for myself and geekboy. ;) Any numbers on that? And yes, those sorts of graphs would be perfect; I'd love to see them.
I went through and calculated the average line lengths for all the people in the short-runs list. You're at ~8.0 characters, and Geekboy's at ~6.6 characters on average. Ashbad beats everyone in the short list, with ~10.6 characters.

I also went ahead and checked what the longest lines are for those. Someone named "jared__" did ~108.5 character runs. Then BrandonW does ~33.5 character ones, and Iambian: ~27.9.

I'll work on the graphs probably when I get home (I've been slacking off at work to do this :P) so be sure to poke me about it if I forget.

Edit: I updated the first post with the new data I got :)
pisg does a lot of channel stats stuff. You might want to look at that for ideas.

By the way, according to my client's chanpeak script, the record for #cemetech (since February 2010) is 73 nicks on Nov. 18, 2011 21:45:08 (US Central). This doesn't count Sax users, though.
Fffffuuuuuuuuuuuuuu


18725 is a fricking ton of lines. (Ofc, most of it was probably spam :P)

I'm very surprised that I'm on here this much. My mental picture doesn't comply with me talking that much...
seana11 wrote:
Fffffuuuuuuuuuuuuuu


18725 is a fricking ton of lines. (Ofc, most of it was probably spam :P)

I'm very surprised that I'm on here this much. My mental picture doesn't comply with me talking that much...
You mostly just say useless stuff that doesn't matter to anyone.
Note to self: make a trie of all the words we used in the channel to come up with a "Cemetech auto-complete" mode.
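
A bare-bones sketch of what such a trie could look like, using nested hashes. The function names and the word list are made up for the example, and a real auto-complete would probably also want to rank completions by how often each word was used.

```perl
use strict;
use warnings;

my %trie;

# Insert a word one character at a time; the '' key marks the end of a word.
sub trie_insert {
    my ($word) = @_;
    my $node = \%trie;
    $node = $node->{$_} //= {} for split //, $word;
    $node->{''} = 1;
}

# Collect every stored word under a node, depth-first in sorted order.
sub collect {
    my ($node, $prefix, $out) = @_;
    for my $key (sort keys %$node) {
        if ($key eq '') { push @$out, $prefix }
        else            { collect($node->{$key}, $prefix . $key, $out) }
    }
}

# Return all stored words that start with $prefix.
sub complete {
    my ($prefix) = @_;
    my $node = \%trie;
    for my $char (split //, $prefix) {
        $node = $node->{$char} or return ();
    }
    my @words;
    collect($node, $prefix, \@words);
    return @words;
}

trie_insert($_) for qw(calc calculator cemetech celtic doors);
print join(', ', complete('ce')), "\n";
```

With the sample words, completing "ce" gives "celtic, cemetech", and a prefix that was never stored gives an empty list.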
I'm curious what you mean by runs? Is this just a sequence of lines said consecutively?
Exactly. It's when a user says something that spans 4 lines without being interrupted by another person, and it's within X minutes of time.
merthsoft wrote:
seana11 wrote:
Fffffuuuuuuuuuuuuuu


18725 is a fricking ton of lines. (Ofc, most of it was probably spam :P)

I'm very surprised that I'm on here this much. My mental picture doesn't comply with me talking that much...
You mostly just say useless stuff that doesn't matter to anyone.
Yup, pretty much accurate.

Catherine: Do you mean *at least four lines?
  