LogBlog

Anton Logging Tip of the Day #15: Fear and Loathing in Event 560 (and 562 and 567)

Following the new "tradition" of posting a security tip of the week (mentioned here, here ; SANS jumped in as well), I decided to follow along and join the initiative. One of the bloggers called it "pay it forward" to the community.

So, Anton Logging Tip of the Day #15: Fear and Loathing in Event 560 (and 562 and 567)

This tip digs into a seemingly simple, but really VERY esoteric subject: monitoring file access and modification via a Windows event log. Now, some people - who never studied this subject - tend to have a very simplistic view of this: just enable Object Access auditing, then right-click on a file or directory, click Security->Advanced->Auditing and then pick what types of events will be logged and by what accessing entities (i.e. users or computers). OK, so this will produce some logs, that is for sure. But are they useful?

First, why are we doing this? We typically need to know the following when we audit file access in Windows (or any other OS for that matter) for security (monitoring and investigation) or compliance:

Can we get this from the above logs? No.

What? No!?! Really?

Yes, really. We can get some of the above, some of the time, not all of the above, all of the time. Here is an example, we are looking at event ID 560 (picture) and then at an extract from its description field.

Event:

event_log560_1_thumb

Description (selected field):

Object Server: Security

Object Type: File

Object Name: C:\0\TestBed\simple_text_file.txt

Image File Name: C:\WINDOWS\system32\notepad.exe

Primary User Name: Anton

Primary Domain: XXXXXX

Accesses: READ_CONTROL

SYNCHRONIZE

ReadData (or ListDirectory)

WriteData (or AddFile)

AppendData (or AddSubdirectory or CreatePipeInstance)

ReadEA

WriteEA

ReadAttributes

WriteAttributes

 

WTH is that? Well, we know that the user  'Anton' has successfully read? wrote? changed attributes? did something? with a file named "C:\0\TestBed\simple_text_file.txt" using a program named "C:\WINDOWS\system32\notepad.exe." That's the best we can get, in this case! We may try to look at event IDs 562 and 567, but this missing information (i.e. the exact action performed) will not be added.

BTW, there will be  a few more dozen (sometime hundreds!) of the 560s, 562s and 567s  produced - all from just opening the text file in a notepad. The above event is notable for having BOTH "notepad" and "simple_text_file.txt" in the same event; others will have either of the two.

Anything else gets in the way? Yes, lots! MS Office will write to all files, even just opened for reading (with no user modifications to the content whatsoever), which will screw up your log monitoring efforts. If the file is on a share, more information will be missing (e.g. username might be).

So, how to use Windows event logs for file access tracking?

  1. Enable logging (as described above)
  2. Pick events 560 (most useful) and 562, 567 (useful too)
  3. Look for fun filenames that might be touched by the users (have a list of files and users handy)
  4. Figure out what programs were used to access them (this is called "Image File Name" in "WinLogSpeak")
  5. Ponder the 'Accesses' section of each event until your brain turns blue :-) or until you decide whether such access is authorized or not...

Overall, this is still very useful for file access monitoring, but the process is somewhat painful.

BTW, I am tagging all the tips on my del.icio.us feed. Here is the link: All Security Tips of the Day.

Technorati tags: , , ,

Posted by Anton Chuvakin on May 08, 2008 in Compliance , Innovation , Log Management & Intelligence , Security | Permalink | TrackBack (0)

Logging Poll #8 Context for Log Analysis

So, my next poll is up - and it is fun (but more technical): what information is most useful when trying to make sense of a log entry?
Vote here! Analysis will be posted here in a few weeks.

Past polls:

  • Poll #7 "What tools do you use for Windows Event Log collection?" (analysis)
  • Poll #6 "Which logs do you LOOK at?" (analysis)
  • Poll #5 "What are your top challenges with logs?" (analysis)
  • Poll #4 "Who looks at logs in your organization?" (analysis)
  • Poll #3 "What do you do with logs?" (analysis)
  • Poll #2 "Why collect logs?" (analysis)
  • Poll #1 "Which logs do you collect?" (analysis)

     

    Technorati tags: , ,
  • Posted by Anton Chuvakin on May 05, 2008 in Innovation , Log Management & Intelligence , LogEd | Permalink | TrackBack (0)

    Critical Log Management Questions - Answered!

    Here are some interesting log management questions I got asked some time ago; reposting here for our blog readers.

    Q1: For those companies that have successfully implemented enterprise-wide logging, what were the big nasty surprises that they encountered?

    A1: Here are a few:

    Q2: For those companies that have successfully implemented enterprise-wide logging, what was their implementation approach?

    A2: Typically, 2-3 vendor PoC or pilot first. Then with the chosen vendor: phased approach based on location + type of log source (e.g. firewalls, then routers, then OS, then proxies, etc) + network topology (e.g. DMZ, then internal) + log source criticality (e.g. critical servers first; the rest next). This might be handy to look at.

    Q3: What kind of storage requirements have been experienced by those organizations who have successfully implemented enterprise-wide logging?

    A3: Massive? :-)

    Here is a simple example: PCI DSS is a bit more aggressive than NERC since it mandates 1 year of log retention vs NERC 90 days, so: 1 year worth of logs is = 365 days x 24 hours x 3600 seconds x 1 (one!!!) busy firewall with 100 log messages each second x 200 bytes per message average (e.g. valid for PIX and ASA devices) = 588 gigabytes / year of raw log data uncompressed (assuming 10x compression you'd get about 60GB of compressed log data per year)

    Store it in RDBMS? Multiple it by 2-3. Have an index? Add about 30%.

    The bottom line is: terabyte is the unit to measure logs.

    Q4: At the organizations that have successfully implemented enterprise-wide logging, how logging impacted network and system performance?

    A4: Too broad a question, so here are a few pointers:

    Q5: What were some successful strategies for obtaining buy-in from system owners and operators in regards to turning logging on?

    A5: OK, also too broad a question, but here are some pointers:

    Q6: How the organizations that have successfully implemented enterprise-wide logging dealt with unusual devices (=log sources) that have no log management vendor support?

    A6: They were in massive pain - if they choose a log management vendor wrong. You need to look for vendors that have "universal log source support" with NO requirement for a custom rules or custom collector/connector/agent development. LogLogic have generic text log collectors that can grab and analyze unknown logs. Typically this is done via some form of text indexing that works across all logs, including those from unknown, vertical, esoteric or custom-developed log sources

    Hope it was useful!

    Technorati tags: ,

    Posted by Anton Chuvakin on April 24, 2008 in Compliance , Innovation , Log Management & Intelligence , LogEd | Permalink | TrackBack (0)

    From Log Apathy to Log Enlightenment

    So, I was talking to this small log management vendor the other day and he confided to me that his product faces fierce competition in his target market (which is, important to note, small to medium companies with 10-100 systems): and this competition is apathy.

    More specifically, his prospects either just blow him off by saying "pah, who needs logging!" or they profess their undying love of all things logging - and then still don't buy his product (which is priced, shall we say, "to go")

    Admittedly - and somewhat tongue-in-cheek, these are the same companies that form the core of today's botnets (due to various reasons including their scarce resources) and enable RBN to deliver high-quality malicious services to criminal enterprises worldwide. Still, if you happen to have thoughts along the line of "who needs logs?" or "ah, logging? it will come later!", you really deserve a nice fat check from RBN and other malicious "hacking" syndicates since it is extremely likely that your overall attitude towards security is just as misguided...

    But how to progress from such ... what was before the Stone Age? ... Sharpened Stick Age? to modernity? Most companies go through the following stages in regards  to their logging:

    1. Deep log ignorance: "Logs? What are those?"
    2. Shallow log ignorance: "Later...later...later... #37 on the TODO list."
    3. Log collection: "We gather and store dead log data...cold."
    4. Log searching: "We will dig into the pile when we have to ... hopefully never!"
    5. Log analysis and reporting: "We know our logs - and what they mean"

    (also see my post "Natural Flow of Log Management" for some specifics)

    Of course, compliance (PCI DSS and others) helped move people from 1. and 2. to 3., but, sadly,  people often get stuck at 3. (just collection) or 4. (collection  + maybe search) and never progress to Logging Enlightenment of 5.

    Yes, PCI DSS and other regulations mandate not just log collection, not just dead cold log storage, but also log review (daily, in case of PCI DSS Requirement 10), but "review" happens to be the item that gets overlooked  all too often.

    Why is that?

    I think the reason is that log analysis is still too hard and still not automated enough for an average organization. Yes, I did see some corporations that built their own log analysis systems that - surprise! - exceeded the best available from the vendors [at the time]. However, a typical company IT department would not have Ph.D. poring through hardcore text mining research papers in order to improve their home-grown log analysis AI. They expect the vendors to  eat the logs, chew on them for a bit - and then spit out the answers.

    Are we there yet? No, but we will be!!!

    Technorati tags: , ,

    Posted by Anton Chuvakin on April 22, 2008 in Compliance , Innovation , Log Management & Intelligence , Security | Permalink | TrackBack (0)

    RSA 2008 Summary

    So, RSA 2008 has come and gone.  Now that everybody has recovered from the information overflow, it is time for summaries and reflections. Before we begin, go read my RSA Impressions (Part 1,2,3,4). Next, read what others said about RSA 2008 (via my del.icio.us feed). Then reflect on past RSA shows (2006, 2007).

    Ready now?

    First, what was the theme? Unlike in the past, I personally couldn't pick any. The candidates were GRC (and compliance + risk management) , DLP (and other data-leak/-theft technologies), IAM (and identity enablement), but none quite measured up to being a show "theme."

    What I saw too much off? Even though their numbers have shrunk, I still saw too many NAC vendors there (Lockdown, here we come!). One of my friends joked that there were more "GRC vendors" than NAC vendors, but both were in low enough numbers to make it a trend. As far as loud noises back from RSA 2007, "identity-driven this or that for security" was still very visible. What is also bizarre is that people still start log management companies. I saw a couple of new ones - amazing, isn't it? Yes, it is a hot space, but come on! Accept that LogLogic is the leader and be done with it :-)

    Overblown messages? "Information-centricity." It was cool and hot :-) and new when Hoff said it.  But when it trickled down to keynotes of some "trailing edge" vendor executive, it became boring and stale. And, while we talk about "information centricity" people must still worry about "A" (availability) first (see this discussion).

    What I didn't see enough of? VOIP security. Somehow this previously hot trend is quieted down. Also, I saw a lot of web application security vendors, but I think that is still not enough as this is an area with a raging fire, not just "some hotness." Also, I expected to see more vendors messaging around (and, actually helping with!) fraud. Dan Geer's Verdasys kinda mentioned that, but pretty much in passing. Is fraud handled outside of security (and thus out of RSA)? I am not sure.

    What I didn't see at all? I didn't see much "market consolidation" - no huge deals, no vendors of note "taken out", etc. Still a huge number of security companies around ... some with products that seem deeply out of touch with today's threat landscape.  One of the speakers said that nowadays "no single security pure-play expected to change the world", but it sure seems like many will try!   Along the same line, Mike R said that such shows are 18-24 months ahead of what "normal" people deploy. This might explain the VOIP and other missing items.

    As I said before, "consumerization" of IT - i.e. IT infrastructure, servers, laptops, storage, services, computing resources, applications, etc provisioned outside of IT departments was an elephant in the room. It is not simply "unmanaged IT" or "consumer-grade IT for business", it is the whole "not-IT-department IT" phenomenon. Yes, via mashups it even includes "non-IT application development" (read this fun 451Group report on that). Security implications of this are nothing short of ginormous.

    In light of this, I liked how one presenter said this: "we lost the desktop" - meaning "1/3 is managed by users [not IT], 1/3 is unmanaged and 1/3 is compromised/ "managed" by hackers."  Sad but true... Dave Aitel used to joke how in the future banks will have to "re-compromise" your PC so that some temporary security can be established for you to transact business with them. Are such horrifying times upon us already? Maybe desktop virtualization will help with this ...

    Overall, RSA 2008 show was an enjoyable, enlightening and thought-provoking experience, as usual. I've heard it was the same even for people who didn't attend a single presentation - indeed, RSA is about people, not slides...

    See you next year at RSA 2009!

    Technorati tags: , , ,

    Posted by Anton Chuvakin on April 16, 2008 in Compliance , Innovation , Log Management & Intelligence , Risk Management , Security | Permalink | TrackBack (0)

    On Trusting Log Timestamps

    It is about time: specifically, timing logs. As I said in my Log Trust and Protecting Logs from Admins posts, the issue of trust is critical in the logging world. After all, logs = accountability; and the latter in unthinkable without trust. If we are to at least pretend that logs objectively record events and user actions, we need to unambiguously establish WHAT happened and WHEN. This post deals with the 'WHEN'  issue.

    So, can we trust that the time stamp in the log file or the one added by the log management system correctly describes when the recorded event actually happened?

    We will start from locating the timestamps in logs. Most of the log formats, such as file-based logs (web servers and proxies, various application, some security gear, etc) and syslog, Windows event logs, database audit tables, etc contain a timestamp. Here are a few examples:

    1. Mar 20 10:03:47
    2. 19/Mar/2006:04:27:33 -0800
    3. Sun Mar 19 04:45:19 2006

    In fact, once I saw somebody use a timestamp to define logs as "timed records of IT activity." So, time is critical for logs being, well, logs :-) At this point it is worthwhile to note that file-based logs will contain a timestamp IN the file, while syslog records arriving over the UDP or TCP port 514 connection are usually timestamped upon arrival BY the syslog daemon (using its own "knowledge" of time) - and then it shows up in the syslog files in /var/log.

    Let's assess whether this "in-log timestamp" provides an adequate way of timing the actual events that are being logged. Answering this question is important for investigations and troubleshooting, but becomes  nearly a matter of life and death in case of log forensics. Here are some fun cases and issues to consider:

    First, what are the chances of a  completely false timestamp in logs (BTW, today is Jan 1, 1970!) When might that happen? Typically when a logging system own clock is reset or not set correctly. This timestamp clearly should NOT be trusted. How do you then time this event? This is a separate story that will be discussed some other time...

    Second, it’s always 5PM somewhere: in other words, what time zone are your logs in? EST? PDT? GMT? UTC? Or any of more than 24 other possibilities. If you have no idea, you should not trust the timestamp.

    Third, are you in drift? Is your system clock? Those pesky drift seconds turn into minutes that then work to undermine the accuracy of timing the records (and thus your certainly and trust in evidence quality)

    Fourth, syslog forwarder mysteries are plenty: some of the syslog messages will be delayed in transit and then  be timestamped by the final recipient daemon, thus completely losing when the event was originally logged. Admittedly, this delayed syslog messages are rare, but as more people employ buffering syslog daemons (e.g. syslog-ng), it might happen more often.

    Fifth, more esoteric, but still real (and really annoying!): some system logs will contain two timestamps. If you don't possess in-depth knowledge of this specific log, confusion has a chance to undermine the trust as well (so, which timestamp should I use?)

    Sixth, most people will not think that they will fall to something that stupid: 24 vs 12 hour time. However, when facing an unknown (and poorly designed!) log format, beware that 5:17 might well be 17:17...

    Finally, if you know that something got logged at 5:17AM, then when did it happen? Beware of "Log lag!" This issue is actually too tricky to give it justice here... The simplest example is when the process leaves a log record when it exits, not when it starts, which might happen days earlier (thus creating a log lag).

    As we dive into more issues with timing logs, we also need to think about sequence timing and absolute timing. Sequence of logged events is a critical fact! Miss the sequence and the whole “house of cards” goes …  But! Absolute time is also important! Can we be assured of both being correct all the time? (hint: no)

    So,  when you look at logs next time and you see a timestamp there - start thinking about all of this :-)

    Technorati tags: , ,

    Posted by Anton Chuvakin on April 04, 2008 in Innovation , Log Management & Intelligence , Risk Management , Security | Permalink | TrackBack (0)

    Anton Logging Tip of the Day #14: More access_log Fun: What Are You Not GETting?

    Following the tradition of posting a tip of the week (mentioned here, here ; SANS jumped in as well), I decided to follow along and join the initiative. One of the bloggers called it "pay it forward" to the community.

    So, Anton Logging Tip of the Week #14: More access_log Fun: What Are You Not GETting?

    In this tip, we will look at some bizarre artifacts that show up in web server access logs today. Here we have a production log from an Apache web server that is full of interesting (and sometimes ominous!) little mysteries that we will investigate in order to determine their impact on security and operational health of the site.

    Logs do contain more mysteries than we have time, so we will focus on a few of them: specifically, unusual web request methods.  Let's see who is trying to POST or use some other method (OPTIONS, HEAD, PUT or something - see a list here) on our site, instead of just GET'ting the content (GET command is used by web browsers to retrieve the pages, while POST is used to upload content, press buttons, etc  - at least in "web 1.0" land  - see earlier tip #12 where POST request was found in proxy logs)

    Here is one little artifact that attracted my attention due to a POST request vs a web forum as well as a battery of slashes (which actually increases in subsequent request - of which there were many)

    10.10.102.250 - - [12/Feb/2008:16:10:50 -0500] "POST /phpBB3////ucp.php?mode=register&sid=e5efaa77a777066c61f71808e9e57b19 HTTP/1.0" 200 14397 http://www.example.com/phpBB3///ucp.php?mode=confirm&id=7640df05c7e24b7acf7a68800fe6dc59&type=1&sid=e5efaa77a777066c61f71808e9e57b19 "Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.2) Gecko/20021126"

    ... more...

    10.10.102.250 - - [12/Feb/2008:16:12:29 -0500] "POST /phpBB3///////////////ucp.php?mode=login&sid=e5efaa77a777066c61f71808e9e57b19 HTTP/1.0" 200 9355 "http://www.example.com/phpBB3//////////////ucp.php?mode=login&sid=e5efaa77a777066c61f71808e9e57b19" "Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:1.2) Gecko/20021126"

    This one really is a mystery; what do we know about it? The server responded to the request OK (code 200), so the POST actually happened. The first request was a request to register with a web discussion board and the second was a request to login. Multiple slashes are  actually ignored  by the web server, so why put them in the request (no answer)? Also, I think that the User-Agent is spoofed ... do you know why? Finally, if I see something like that in my logs, I will definitely investigate it, primarily due to the fact that Apache responded with 200 OK code.

    The next one is so classic it it dumb (and so dumb, it's a classic :-))

    10.10.123.226 - - [12/Feb/2008:03:46:54 -0800] "POST /_vti_bin/shtml.exe/_vti_rpc HTTP/1.1" 404 - "-" "MSFrontPage/6.0"

    10.10.123.226 - - [12/Feb/2008:03:46:55 -0800] "OPTIONS / HTTP/1.1" 200 20210 "-" "Microsoft Data Access Internet Publishing Provider Protocol Discovery"

    It is probably one of the ancient IIS attacks (check out this fun BlackHat preso on that, circa 2003) - why would someone probe for it now is beyond me. In any case, Apache on Linux and "*.exe" don't mix :-)

    The final log record is also fun:

    10.10.101.222 - - [12/Feb/2008:15:33:22 -0800] "PUT /zk.txt HTTP/1.0" 405 223 "-" "Microsoft Data Access Internet Publishing Provider DAV 1.1"

    The above uses a PUT request which is pretty much deprecated now; the purpose of the above is clearly malicious. In fact, modern Apache shouldn't even allow it, thus it responds with code 405 "Method Not Allowed." Nothing to worry about (even though some poor critter got owned with that! BTW, if you follow that link, check out HTTP response code 201 - if you see it in your logs, run! :-))

    Overall, if you see too many POSTs or too many "GET then POST" sequences from the same IP in rapid succession, investigate it since no legitimate access should produce such a pattern...

    As further reading, I heartily recommend this paper: "Detecting Attacks on Web Applications from Log Files"

    Also, I am tagging all the tips on my del.icio.us feed. Here is the link: All Security Tips of the Day.

    Posted by Anton Chuvakin on March 12, 2008 in Innovation , Log Management & Intelligence | Permalink | TrackBack (0)

    Poll #7: What tools do you use for Windows Event Log collection?

    My next fun logging poll is here - please vote! It is about tools for centralized collection of Windows Event Log from servers and other systems. One of the somewhat surprising discoveries from my previous poll was that few people look at Windows logs; this poll drills down into it.

    And, don't forget that ProjectLASSO Windows event collector allows people to grab Windows event logs remotely without those hated agents...

    Past logging polls and their analysis:

  • Poll #6 "Which Logs Do You LOOK At?" (analysis)
  • Poll #5 "What are your top challenges with logs?" (analysis)
  • Poll #4 "Who looks at logs in your organization?" (analysis)
  • Poll #3 "What do you do with Logs?" (analysis)
  • Poll #2 "Why collect logs?" (analysis)
  • Poll #1 "Which logs do you collect?" (analysis)
  •  

    Technorati tags: , , ,

    Posted by Anton Chuvakin on March 07, 2008 in Innovation , Log Management & Intelligence , Project Lasso | Permalink | TrackBack (0)

    Logging Poll #6 "Which Logs Do You LOOK At?" Analysis

    This poll on looking at logs  poll was relatively popular; lets see what we can learn (live results are also here).

    image_thumb2

    First, what are the top 3 log types that people look at? They are:

    1. Unix/Linux server syslog
    2. Web server logs
    3. Firewall logs

    How does that compare with the top 3 log types that people collect (see picture showing results from my previous poll below)?

    image_thumb4

    These are:

    1. Unix/Linux server syslog
    2. Firewall logs
    3. Web server logs

    Huh? They are the same - doesn't it just make sense? What are the possibilities here?

    a. People only collect the logs they plan to look at, OR

    b. People only look at logs they collect (duh!).

    Strangely, I find a) unlikely; I think most people collect more than they can review and that the incident/issue response and compliance needs drive collection more than review or analysis.

    Another observation is that all of the "big 3" log types are useful for security, operations and compliance and not just for security (like NIDS/NIPS logs). Is that why they are so popular?

    Second, I was fearful that "I only look at whatever logs needed for the incident/issue investigation" will win. It didn't!!! This to me indicates that proactive log review is not as unpopular as I feared. Good! It is working.

    Third, obviously, nobody (well, 4%...) looks at all logs they collect.

    Fourth, much more people look at Unix/Linux logs than Windows server logs (factor of 3x); this is not entirely unexpected and my next poll will drill down into this.

    Finally, I am SHOCKED that people don't look at NIDS/NIPS logs (only 11% do). Why have you deployed those "beasts" if you don't look at what they produce? Then again, maybe you haven't...

    Next poll coming up!

    Technorati tags: , ,

    Posted by Anton Chuvakin on March 07, 2008 in Innovation , Log Management & Intelligence | Permalink | TrackBack (0)

    Audit/Monitor Controls or Audit/Monitor BEFORE Control?

    Back in 2004, Forrester paper called "The Natural Order Of Security Yields The Greatest Benefits" proclaimed that "the adoption of security has a natural order: 1) authentication; 2) authorization; 3) administration and 4) audit." Note that audit which, in this case, broadly includes audit, monitoring and detection, comes last. It seems to be fairly in line with common sense: you audit the controls after you put them in place; you monitor after you have authentication and authorization taken care of and you detect the violations after you organized your administration.

    The paper clarifies: "With people and contexts defined, protective controls in place, and policies outlined, the fourth set of questions [i.e. regarding the audit] includes: “What happened?”, “What is happening?” or, especially, “Is it working?”

    However, is this really so? Or, is this always so? First, when reality collided with plans,  many of the organizations that followed that wisdom got mired in one phase (e.g. in trying to get authentication under control) and ended up having no audit whatsoever: in other words, they are flying blind while implementing controls!  Second, in some cases controls (authentication, authorization, administration) will actually be impossible to implement, while audit will be possible. Imagine retrofitting a legacy application for granular authorization? Third, in some cases implementing prevention/control will be much more complicated, compared to implementing audit: thus, people will face a choice between a half-baked control to a full-blown audit capability? An example will be managing which file each user can access vs monitoring/auditing which file each user have accessed. The latter is doable, while the former is next to impossible. Another way to phrase it is "reactive but possible" vs "proactive but impossible" (hint: pick the former :-))

    I think the idea of putting audit first in some cases is the way we'd need to progress. "Wow, what a blasphemy!", some might say ...  After all, if you have not defined controls, what are you going to audit? But remember that audit is meant broadly in this context and thus the opposite question is very relevant: what are you going to control if you have no idea what is going on? People sometimes define a security policy based on how things should be (and then WSJ happens - refresh your memory of the "WSJ saga" here), but then spent years trying to bring policy and reality together (and end up with an environment which is "half-controlled") Won't it be better to audit first, then control?  Or, at the very least, to run BOTH project at the same time so that improvements in control are audited and audit discoveries lead to control improvements!

    Obviously, "do IDS before IPS" falls under the same principle: monitor first, [try to] control second. Here is another example: implement log management before identity management. Looking at logs will tell you what privileges the users actually use for doing their daily jobs. Then you can mix it up with what the idea access policy will be.

    So, think about it! Questioning the common wisdom does often bring interesting insights.

    Posted by Anton Chuvakin on March 06, 2008 in Innovation , Log Management & Intelligence , Security | Permalink | TrackBack (0)

    Audit/Monitor Controls or Audit/Monitor BEFORE Control?

    Back in 2004, Forrester paper called "The Natural Order Of Security Yields The Greatest Benefits" proclaimed that "the adoption of security has a natural order: 1) authentication; 2) authorization; 3) administration and 4) audit." Note that audit which, in this case, broadly includes audit, monitoring and detection, comes last. It seems to be fairly in line with common sense: you audit the controls after you put them in place; you monitor after you have authentication and authorization taken care of and you detect the violations after you organized your administration.

    The paper clarifies: "With people and contexts defined, protective controls in place, and policies outlined, the fourth set of questions [i.e. regarding the audit] includes: “What happened?”, “What is happening?” or, especially, “Is it working?”

    However, is this really so? Or, is this always so? First, when reality collided with plans,  many of the organizations that followed that wisdom got mired in one phase (e.g. in trying to get authentication under control) and ended up having no audit whatsoever: in other words, they are flying blind while implementing controls!  Second, in some cases controls (authentication, authorization, administration) will actually be impossible to implement, while audit will be possible. Imagine retrofitting a legacy application for granular authorization? Third, in some cases implementing prevention/control will be much more complicated, compared to implementing audit: thus, people will face a choice between a half-baked control to a full-blown audit capability? An example will be managing which file each user can access vs monitoring/auditing which file each user have accessed. The latter is doable, while the former is next to impossible. Another way to phrase it is "reactive but possible" vs "proactive but impossible" (hint: pick the former :-))

    I think the idea of putting audit first in some cases is the way we'd need to progress. "Wow, what a blasphemy!", some might say ...  After all, if you have not defined controls, what are you going to audit? But remember that audit is meant broadly in this context and thus the opposite question is very relevant: what are you going to control if you have no idea what is going on? People sometimes define a security policy based on how things should be (and then WSJ happens - refresh your memory of the "WSJ saga" here), but then spent years trying to bring policy and reality together (and end up with an environment which is "half-controlled") Won't it be better to audit first, then control?  Or, at the very least, to run BOTH project at the same time so that improvements in control are audited and audit discoveries lead to control improvements!

    Obviously, "do IDS before IPS" falls under the same principle: monitor first, [try to] control second. Here is another example: implement log management before identity management. Looking at logs will tell you what privileges the users actually use for doing their daily jobs. Then you can mix it up with what the idea access policy will be.

    So, think about it! Questioning the common wisdom does often bring interesting insights.

    Posted by Anton Chuvakin on March 06, 2008 in Innovation , Log Management & Intelligence , Security | Permalink | TrackBack (0)

    Top 11 Reasons to Analyze Your Logs

    As promised, here is another "Top 11 Reasons" which is about log analysis. Don't just read your logs (definitely don't just collect them); analyze them. Why? Here are the reasons:

    1. Seen an obscure log message lately? Me too - in fact, everybody have. How do you know what it means (and logs usually do mean something) without analysis? At the very least, you might need to bring additional context to know what some logs mean (example: IP address -> hostname -> server owner)
    2. Logs often measure in gigabytes and soon will in terabytes; log volume grows all the time - it definitely passed the  limit of what a human can read a long time ago, it then made simple filtering 'what logs to read' impossible as well: automated log analysis is the only choice.
    3. Do you peruse your logs in real time? This is simply absurd! However, automated real-time analysis is entirely possible (and some logs do crave for your attention ASAP - e.g. major system failures, confirmed intrusions, etc)
    4. Can you read multiple logs at the same time? Yes, kind of, if you print them out on multiple pages to correlate (yes, I've seen this done :-)). Is this efficient? God, no! Correlation across logs of different types is one of the most useful approaches to log analysis.
    5. A lot of insight hides in "sparse" logs, logs where a single record barely matters, but a large aggregate does (e.g. from one "connection allowed" firewall log to a scan pattern). Thus, the only way to extract that insight from a pool of data is through  algorithms that "condense" that collection of logs into usable knowledge (some say, visualization is the way to go)
    6. Ever did a manual log baselining? This is where you read the logs for a while and learn which ones are normal for your environment. Wonna do it again? Thought so :-)  Log baseline learning is a useful and simple log analysis technique, but humans can only do it for so much before burning out.
    7. OK, let's pick the important logs to review. Which ones are those? The right answer is "we don't know, until we see them." Thus, to even figure out which logs to read, you need automated analysis.
    8. Log analysis for compliance? Why, yes! Compliance is NOT only about log storage (e.g. see PCI DSS). How to highlight compliance-relevant messages? How to see which messages will lead to a violation? How do you satisfy those "daily log review" requirements (again, see PCI DSS)? Through automated analysis, of course!
    9. Logs  allow you to profile your users, your data and your resources/assets. Really? Yes, really: such profiling can then tell you if those users behave in an unusual manner (in fact, the oldest log analysis systems worked like that). Such techniques may help reach the holy grail of log analysis: have the system automatically tell you what matters for you!
    10. Ever tried to hire a log analysis expert? Those are few and far between. What if your junior analysts can suddenly analyze logs just as well? One log analysis system creator told me that his log data mining system enabled exactly that. Thus, saving a lot of money to his organization.
    11. Finally, can you predict future with your logs? I hope so! Research on predictive analytics is ongoing, but you can only do it with automated analysis tools, not with just your head alone (no matter how big :-)) ...

     Past top 11 reasons:

     

    Technorati tags: , ,

    Posted by Anton Chuvakin on February 22, 2008 in Innovation , LogMatters | Permalink | TrackBack (0)

    Logging Poll #5 "Top Logging Challenges" Analysis

    OK, this poll was insightful! The raw results are here and below:

    logpollchallengesresults_thumb

    What can we learn from this?

    First, what are the top challenges? It is with great regret :-) that I report that the #1 challenge is exactly the one I thought it would be: We collect logs but don't have time/resources to look at them. Yes, automated "analysis challenge" has only become more of a challenge as people deploy more tools that enable log collection on a massive scale (e.g. 75,000 logs/second). I dare to predict that we will finally have to tackle this one in the next year or two. In fact, this challenge rears it ugly head via another popular response, Lack of log analysis tools, which made Top 5 responses as well.

    Second, even though I didn't make any predictions about the #2 entry, but I was surprised: No way to effectively search all logs is a  very close #2 (obviously, 1 vote is not statistically significant here). Indeed, log searching is an elusive little problem, especially when we want to do it fast and on a large pool of logs. Even though I think we need to search less and discover more, the need to search logs will be with us forever (and, no, I don't think you need a special product just to do that ...)

    Third, I am happy to report that this poll indicates that we finally broke the back of "the beast" of  not having logs. Responses that point at not having logs (e.g. Logging is not enabled, We don't know what logging we must enable,  etc) are not terribly popular (then again, maybe it is due to self-selection of the blog readers ...)

    Fourth, infrastructure! Specifically, No infrastructure to manage the log volume we have is very popular as well (#4). This proves the point that I used to not take very seriously in the past (by mistake): when megabytes become gigabytes and those grow into terabytes, many things that used to trivial (e.g. moving logs from A to B, saving logs to disk, etc) become grand engineering challenges... Indeed, to manage high volume of logs you need a scalable log management solution (example)

    Sixth, as I lamented, few care about log security (this counts as laments, I guess).  Secure storage of logs is only bothering a few people. One word: yet! As of today, stored log hashing + (sometimes!) log transport encryption + (rarely!) encrypted archives are the state of the art.

    Next poll is coming up!

    Technorati tags: , ,

    Posted by Anton Chuvakin on February 08, 2008 in Innovation , Log Management & Intelligence , LogMatters | Permalink | TrackBack (0)

    SANS Security Laboratory Thought Leadership Interview With Dr Anton Chuvakin

    Here is an insightful interview with me done by Stephen Northcutt at SANS. I share a bunch of thoughts on logging and log management. For example, what is my #1 logging pet peeve, what's the #1 logging mistake, will we ever see log standards, why are we looking at an increase in the number of log types we need to look at, etc.

    It starts like this: "Dr. Anton Chuvakin from LogLogic has agreed to be interviewed by the Security Laboratory and we certainly thank him for his time! He is probably the number one authority on system logging in the world, and his employer is probably the leading vendor for logging, so we appreciate this opportunity to share in his insights."

    Technorati tags: , ,

    Posted by Anton Chuvakin on January 31, 2008 in Innovation , Log Management & Intelligence | Permalink | TrackBack (0)

    Poll: What are your top challenges with logs and logging?

    This poll is especially fun: What are your top challenges with logs and logging? Vote here.

    Past polls were:

  • Poll #4 "Who looks at logs in your organization?" (analysis)
  • Poll #3 "What Do You Do With Logs?" (analysis)
  • Poll #2 "Why Collect Logs?" (results so far, my analysis)
  • Poll #1 "Which Logs Do You Collect?" (analysis)

     

    Technorati tags: , , ,

  • Posted by Anton Chuvakin on January 21, 2008 in Innovation , Log Management & Intelligence | Permalink | TrackBack (0)

    Logging Poll #4 "Who Looks at Logs?" Analysis

    Time to analyze my final 2007 poll on logs. In it, I asked who actually looks at logs at the organization. Here is what came up: results are here and also included below.

     pollwholooks_thumb

    What can we conclude from this?

    First, an obvious conclusion is in order! No matter how many times one can utter the word "compliance," logs are still most useful for mundane (one would hope!) system administration. Yes, indeed, sysadmins are the primary consumers of logs - yesterday, today, and - likely! - tomorrow as well.

    Second, I am saddened by the fact that application developers have not warmed up to logs, at least not en masse (and not according to this limited poll...). I am guessing when they start thinking of logging when creating their applications, they will be more aware of the fact that you can troubleshoot the applications using logs ...

    Third, incident response team showing that low is some kind of fluke, I am sure. Everybody knows that logs are indispensable during incident response. Yes, even if only a little logging was enabled or even logging defaults left in place, logs often reveal answers unobtainable via any other mechanisms.

    Next poll coming soon!

    Technorati tags: , ,

    Posted by Anton Chuvakin on January 08, 2008 in Innovation , Log Management & Intelligence , LogEd , LogLogic News | Permalink | TrackBack (0)

    On Approaches to Database Monitoring

    So, people sometimes ask me about how to do database logging/auditing/monitoring and log analysis right. The key choice many seem to struggle with for database auditing and monitoring is reviewing database logs vs sniffing SQL traffic off the wire.  Before proceeding, please look for more background on database log management, auditing and monitoring in my database log management papers (longer, more detailed - shorter)  The table below summarizes the situation with database monitoring and auditing - now you can make your choice more intelligently (items in bold are the ones I consider key):

     

      Pro Con
    Sniff SQL traffic from the wire
    • No database performance impact
    • Awareness of returned content (for SELECTs)
    • Guaranteed role separation
    • Better for DBA monitoring
    • No agents
    • No database configuration changes
    • Extra device needs to be purchased, deployed and managed
    • Doesn't work with encryption
    • No local access monitoring
    Collect and analyze database logs
    • No extra $$$ - use your existing logging tool
    • Can user review activity across log sources, from databases to servers
    • Satisfies compliance demand for "database log review"
    • Can monitor ALL access to data in the database, even over APIs and local
    • Performance impact possible (*)
    • Database config changes needed
    • Usually not truly "real-time" (polling)

    Choose logs if you care for the relevant Pros (esp key ones) associated with them; choose sniffing if you care for the Pros and are NOT undermined by their Cons (e.g. difficulties of supporting encrypted traffic)

    Of course, one can also opt for a combined approach which follows the ideas of "double the benefits - for double the cost"...

    (*) Nobody really knows what it will be in each particular situation: 0-40% were observed under various conditions by various people ...

    Posted by Anton Chuvakin on December 17, 2007 in Innovation , Log Management & Intelligence , LogMatters | Permalink | TrackBack (0)

    Protecting Logs from Admins: A Lost Battle?

    One of the truly major challenges of log management is obtaining trusted logs of administrator activities, sometimes broadened to cover so-called "privileged users" (see my log trust pyramid post for more discussion). Many consider it to be the "lost battle" of logging. However, logging administrator access and actions is more important than ever today; it is one of the few workable ways of dealing with insider attacks.

    So, I created this little table where I try to summarize some ways of protecting the C-I-A (confidentiality, integrity, availability) of various logs from administrators and privileged users (some successful and some very hard to pull off). Note that in the case of databases and applications, we need to protect their logs from DBAs and application admins and not from the underlying server platform admins.

    UPDATE: table below might be corrupted in some blog readers; see the full table here.

     

      "C" - prevent admins from reading logs "I" - prevent admins from changing logs "A" - prevent admin from disabling logging
    Standard Unix Very hard! Maybe stealth logging (sebek) Remote logging via syslog to another server, append-only log files (via RBAC) Very hard! But this is logged and thus can be detected (also: stealth logging)
    Windows server Very hard! Maybe stealth logging (sebek for Windows) Pull the logs ASAP to a central server Very hard! But this is logged and can be detected (also stealth logging)
    Databases DBA activity log stored outside the database (append-only access) DBA activity log stored outside the database (append-only access) DBA activity log stored outside the database
    Firewalls and network gear Remote logging via syslog to another server - no local logging Remote logging via Very hard! But this is logged and can be detected
    IDS/IPS boxes Remote logging to another server - no local logging Remote logging to another server, inaccessible to admin Very hard! But this is logged and can be detected
    Misc enterprise applications App admin log outside the app (not readable to application user) App admin log outside the app (only appendable by the application user) Very hard! But this is logged and can be detected

    UPDATE: table above might be corrupted in some blog readers; see the full table here.

    Comments?

    Posted by Anton Chuvakin on November 26, 2007 in Innovation , Log Management & Intelligence , Risk Management , Security | Permalink | TrackBack (0)

    Imagining an Ideal Log Management Tool?

    The idea came from Jeremiah Grossman (here) when he described "The Best Web Application Vulnerability Scanner in the World" thus: "Within a few moments of pressing the scan button it’ll find every vulnerability, with zero false positives, generate a pretty looking report, and voila you’re compliant with GLBA, HIPAA, and PCI-DSS. Of course, we all know such a web application scanner is simply not possible to create for a variety of reasons."

    So, let's imagine the ideal log management application.

    1. Logging configuration: the ideal log app will go and find all possible log sources (systems, devices, applications, etc) and then enable the right kind of logging on them according to a high level policy given to it
    2. Log collection: it will collect all the above logs securely (and without using any risky super-user access ) and with little to no impact to networks and systems
    3. Log storage: it can security store the above logs in the original format for as long as needed and in a manner allowing quick access to them - in both raw and summarized/enriched form
    4. Log analysis: this ideal application will be able to look at all kinds of logs, known to it and previously unseen, from standard and custom log sources, and tell the user what they need to know about their environment and based on their needs: what is broken? what is hacked? where? what is in violation of regulations/policies? what will break soon? who is doing this stuff? The analysis will power all of the following: automated actions, real-time notifications, long-term historical analysis as well as compliance relevance analysis
    5. Information presentation: this tool will distill the above data, information and conclusions generated by the analytic components and present then in a manner consistent with the user's role: from operator to analyst to engineer to executive. Interactive visual and drillable text-based data presentation across all log sources. The users can also customize the data presentation based on their wishes and job needs, as well as information perception styles
    6. Automation: the ideal log management tool will be able to take limited automated actions to resolve discovered and confirmed issues as well as generate guidance to users so that they know what actions to take, when full-auto mode is not appropriate. The responses will range from full-auto actions to assisted actions ('click here to fix it') to issuing detailed remediation guidance. The output will include a TODO-list of discovered items complete with actions suggested, ordered by priority
    7. Compliance: this tool can also be used directly by auditors to validate or prove compliance with relevant regulations by using regulation-specific content and all the collected data. The tool will also point at gaps in data collection as it applies to specific regulations that the user is interested in complying

    In other words, this magic black box will have raw log data shoveled from one side and will have answers to key questions that concern the IT users and their managers on the other side.

    Am I nuts? Well, can I dream for a second? :-)

    Technorati tags: , , , ,

    Posted by Anton Chuvakin on November 06, 2007 in Innovation , Log Management & Intelligence , LogMatters | Permalink | TrackBack (0)

    Top 11 Reasons to Secure and Protect Your Logs

    Following my now-famous Top 11 Reasons to Collect and Preserve Computer Logs and Top 11 Reasons to Look at Your Logs, here is the promised "Top 11 Reasons to Secure and Protect Your Logs"

    1. Let's review why you are reviewing logs. Will logs that might have been changed by somebody, somewhere, somehow still be useful for items 1-11 from here? No? Secure them!
    2. Logs in court? Challenges abound! To respond to them, one needs to protect the logs so you can claim that they are both authentic and reliable.
    3. A human error still beats an evil hacker as the main cause of IT problems. Are your logs safe from it? Available when needed? Protect them from crashes and other faults!
    4. PCI DSS just says so: "Secure audit trails so they cannot be altered." Want to do it- or pay the fines?
    5. Do you protect financial records? Identity info? Passwords? Some of it ends up in logs (even if by mistake) - thus making them sensitive, protected information. Secure the C-I-A of logs!
    6. Do you look at logs during incident investigation? Do you want them to be "true" or full of random (if creative...) "stuff", inserted by the guilty party? Secure the logs!
    7. Think that "attacks vs logging" are theoretical? Think again. Are your logs safe or vulnerable? Is your logging tool 0wned?
    8. Syslog + UDP = log injection + log loss. Are you protected (reliable TCP, confirmed delivery, encryption - SSH, SSL, VPN)?
    9. Why modify logs? No, really, why change logs? If you never change logs - and you never should, unless it is appending to them - hash them right away after collection to make them immutable.
    10. Logs are backed up on tape - who will see them? Well, whoever restores the tape, that's who! Encrypt them to protect them from accidental and malicious disclosure if tape is lost.
    11. Why log access to logs? Same reason why you had the logs in the first place - to review who did what. Who broke through and stole the logs? Who browsed them without permission? Only logs will tell - if you have them!

    Overall, one need to strive for having no holes in log safeguards from log birth to analyst conclusion based on log information...

    Possibly related posts:

    Technorati tags: , , ,

    Posted by Anton Chuvakin on November 02, 2007 in Innovation , Log Management & Intelligence , LogMatters | Permalink | TrackBack (0)

    Just What is Enterprise Class? Part V

    Here is the final enlightening post from Dimitri McKay, our super-brilliant network and systems engineer from the East Coast, where he continues to discuss the meaning of the phrase "enterprise-class," which is certainly MUCH more than a marketing buzzword! Part I is herePart II is herePart III is herePart IV is here.

    Dimitri McKay says: "This is the final piece in the five part series entitled “Enterprise Class”. The term “Enterprise Class” for most of us has appeared on Bullshit-Bingo cards for years, but when we actually explore WHAT “Enterprise Class” is and how it pertains to a certain product, you can learn quite a bit about that product, and how it stands up to both the competition and how it stacks up against what it is and what it should be.

    As of this writing, LogLogic has set free it’s next code-base, LogLogic4 Release 2. With added features and fixes, LogLogic fits the Enterprise all the more with each release.

    13. Usability: Because of the large number of customers using LogLogic, the mission critical nature of the business problems being addressed, and the system complexity that is introduced due to the previous topics, usability is a critical factor for LogLogic. Our goal is to strike an appropriate balance between ease of use and complex functionality. Let’s face it. Using LogLogic to handle forensics, or compliance, or general reporting and alerting is an easy solution to a very complex problem. We are taking the chaos of a massive amount of log-traffic and we are taming it via the use of log forensics tools, summary reports, and real-time alerts.

    When I was on the support team one of the biggest questions customers would call and ask after purchasing the LogLogic appliance was “Now what? What should I be looking for?” - and the answer LogLogic came up with was the compliance suites. The compliance suites are a phenomenal way of helping the customer hit the ground running. It’s simple. “Install this suite of search filters, alerts and custom reports. Then tailor it to your network. Now you’ve solved your compliance problems and/or government mandates.”

    14. Compatibility/Maintainability: To minimize downtime as part of providing high availability, LogLogic must provide compatibility from one release to the next. Backward compatibility means that the when a new version of LogLogic  is installed, it continues to work with the previously installed customizations and other systems witch which it interacts such as source devices and remote authentication services.

     Forward compatibility means that when systems around the LogLogic are upgraded, the LogLogic must continue to function as it did before, even though the new software may support new log formats. When fixes are provided for LogLogic, downtime for installing the fix must be minimized by providing support for incremented updates such as patches and hot-fixes. LogLogic must also offer several solutions to handle updates such as automatic update via the web, or manual file updates due to the appliances being offline.

    Thank you for reading my five piece foray into the Enterprise. Please feel free to email me your thoughts, comments, or post them below. Stay tuned to LogLogic as we journey where no other LMI vendor has gone before."

    Technorati tags: , ,

    Posted by Anton Chuvakin on October 31, 2007 in Innovation , Log Management & Intelligence , LogMatters | Permalink | TrackBack (0)

    Log Collection Poll Results

    My poll on log collection have been a great success and now the results are in. Here is the link to the running totals as of now (the graphic snapshot from yesterday can be seen here). Let's review and discuss the findings after running it for slightly over a week.

    Next poll coming soon! Thanks a lot for responding!

    Possibly related posts:

     

    Technorati tags: , ,

    Posted by Anton Chuvakin on October 26, 2007 in Innovation , Log Management & Intelligence ,