Thursday, January 08, 2015

Datadog and many dataseries stacked together

Recently, I've started to use Datadog. It has nice features, but I have also found some annoying gaps. One of them is the lack of an easy way to stack different series in one graph, for example a nice representation of CPU time spent in different states.



Luckily, as you can see above, it can be done. You just need to change a few things in the JSON to get something similar to what I have below. The main point is to put all the data series into a single "q" argument.

{
  "viz": "timeseries",
  "requests": [
    {
      "q": "avg:system.cpu.system{host:host-01}, avg:system.cpu.user{host:host-01}, avg:system.cpu.iowait{host:host-01}, avg:system.cpu.stolen{host:host-01}, avg:system.cpu.idle{host:host-01}",
      "type": "area"
    }
  ],
  "events": []
}
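
With more metrics or hosts, the long "q" string is easy to get wrong by hand. Below is a minimal Python sketch that builds the same structure from a list of metric names. The function name stacked_graph and its parameters are my own for illustration, not part of any Datadog library; the output is only the JSON to paste into the graph editor.

```python
import json


def stacked_graph(metrics, scope, aggregator="avg", graph_type="area"):
    """Build a timeseries graph definition with all series in a single "q"."""
    # Every series goes into one comma-separated query string.
    query = ", ".join(
        "{0}:{1}{{{2}}}".format(aggregator, metric, scope) for metric in metrics
    )
    return {
        "viz": "timeseries",
        "requests": [{"q": query, "type": graph_type}],
        "events": [],
    }


cpu_metrics = [
    "system.cpu.system",
    "system.cpu.user",
    "system.cpu.iowait",
    "system.cpu.stolen",
    "system.cpu.idle",
]
graph = stacked_graph(cpu_metrics, "host:host-01")
print(json.dumps(graph, indent=2))
```

The important design choice is the same as in the hand-written JSON: one request with one "q" holding every series, rather than one request per series.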

Tuesday, January 06, 2015

Count processes per state per application

In previous posts (here and here) I discussed how to count threads in a given state for a given process. Recently, I had another problem: I needed to count the number of processes per application per state. My previous commands wouldn't work, so I wrote an alternative version.

while [ 1 ];
do
    date;
    cat /proc/loadavg;
    ps -Leo state,args |
     awk ' $1 ~ /(D|R)/ {state[$0]++} \
      END{ for (j in state) {printf "%s - %d\n", j, state[j]}}' |
      sort -k 2;
    echo "---";
    sleep 5;
done 

The PID is not included, and args are used in the output as a whole, as I was only worried about the number of processes.

One more thought. Dropping "$1 ~ /(D|R)/" can be useful when the problem is the total number of processes. But then the whole command should be modified a bit, so that the results are sorted by the number of processes. A simplified version would look like this:

while [ 1 ];
do
    ps -Leo state,args |
     awk ' {state[$0]++} \
      END{ for (j in state) {printf "%d - %s\n", state[j], j}}' |
      sort -n;
    echo "---";
    sleep 5;
done 
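
The counting done by awk can also be sketched in Python, which is handy when the numbers should go somewhere else than the terminal. This is only an illustration: the function name count_by_state and the sample text are made up, and the input is expected in the "state args" format printed by ps above.

```python
from collections import Counter


def count_by_state(ps_output, states=None):
    """Count identical "state args" lines, optionally keeping only some states."""
    counts = Counter()
    for line in ps_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first column is the one-letter process state.
        state = line.split(None, 1)[0]
        if states is None or state[0] in states:
            counts[line] += 1
    return counts


# Made-up sample in the format of "ps -Leo state,args".
sample = """S COMMAND
R /usr/bin/python app.py
D /usr/sbin/httpd -k start
R /usr/bin/python app.py
S /usr/sbin/sshd
"""

counts = count_by_state(sample, states="DR")
# Sorted by the number of processes, largest first.
for line, number in counts.most_common():
    print("%d - %s" % (number, line))
```

Calling count_by_state(sample) without states corresponds to dropping the D/R filter. Note that, just like in the awk version, the header line printed by ps is then counted too.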

Tuesday, December 30, 2014

What is the (UNIX) load?

The "load" is widely used to describe the stress/work applied to a UNIX system. The simple rule is "the lower, the better". In the old days of uniprocessor machines, a load of 1 was a kind of borderline. In the brave new world of multi-core/multi-processor machines, a load of 1 means nothing. Many people suggest that a load equal to or lower than the number of processors/cores is good. That sounds sensible, but it is not always accurate.
Why? To answer that, we have to come back to the question asked in the title.

What is the "load"?

The load is the exponentially damped/weighted moving average of the number of processes (including threads) using or waiting for a CPU and, at least on Linux, in the uninterruptible sleep state, over the last 1, 5 and 15 minutes (see Wikipedia). The last part means that all processes/threads waiting for a disk (or another I/O device) increase the load without increasing CPU usage. This leads to situations where a load lower than the number of cores/processors is still dangerous. Imagine a few processes trying to dump important information to disk, especially if all interrupts have affinity to one processor only (see this post) or the data is stored in many small files. On the other hand, a machine with a very high load might be very responsive: plenty of processes waiting to write information to a disk, while not using much memory or CPU at the same time. Just look at this picture:



If you want to know even more details of how the load is actually calculated, read this impressive white paper.
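
To get a feeling for what "exponentially damped moving average" means, here is a small Python sketch of the update step. It assumes a sample every 5 seconds, as Linux does, and uses floating point where the kernel uses fixed-point arithmetic, so treat it as an illustration only. The function names are mine.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumption, as on Linux)


def update_load(load, active, window):
    """One step of the exponentially damped moving average.

    load   - previous value of the average
    active - processes/threads running, runnable or in uninterruptible sleep
    window - averaging window in seconds (60, 300 or 900)
    """
    decay = math.exp(-SAMPLE_INTERVAL / window)
    return load * decay + active * (1.0 - decay)


def simulate(active_counts, window=60.0, load=0.0):
    """Feed a sequence of samples through update_load and return the result."""
    for active in active_counts:
        load = update_load(load, active, window)
    return load


# One process constantly active for one minute (12 samples), starting from 0.
one_minute = simulate([1] * 12)
print(round(one_minute, 3))
```

After one minute with a single active process, the 1-minute load reaches only about 0.63, not 1. The same process stuck in uninterruptible sleep would give exactly the same number without using any CPU.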


Links:
http://en.wikipedia.org/wiki/Load_%28computing%29
http://www.teamquest.com/pdfs/whitepaper/ldavg1.pdf
http://larryn.blogspot.co.uk/2013/05/cpu-affinity-interrupts-and-old-kernel.html

Saturday, December 06, 2014

Install CyanogenMod at Nook HD+

Recently I decided to try the new CyanogenMod (CM11) on my Nook HD+. Initial reading indicated that I had to reinstall using Recovery rather than the internal updater. I tried so hard to boot into Recovery that I restored the official B&N OS, which replaced CM.

I needed to start from the beginning. I did some research and found that post. It looked good, so I gave it a try. First I downloaded the ClockworkMod attached to the post, then the latest CM snapshot from there, and added Google Apps for CM11 from there. I put everything on the SD card as described and kicked off the installation. It flew like an Albatross. (To be honest, I don't know why I wrote Albatross, maybe because of this?)



Anyway, CM11 works well on the Nook HD+.

Links:

  • http://wiki.cyanogenmod.org/w/Ovation_Info
  • http://download.cyanogenmod.org/?type=snapshot&device=ovation
  • http://wiki.cyanogenmod.org/w/Google_Apps
  • http://forum.xda-developers.com/showpost.php?p=42406126&postcount=7
  • http://forum.xda-developers.com/attachment.php?attachmentid=2849350&d=1405272804

Sunday, November 23, 2014

More fabric as a library

Recently I had to prepare a tool running some remote commands, so of course I decided to use Fabric, but I had a big problem controlling hosts. I remembered that I had written a short article on Fabric here some time ago, but it didn't help. I asked on the Fabric mailing list, but got no help there either.


Manual host name control

In this tool I didn't need to run many parallel SSH connections, so I decided to control the remote host name from inside the loop by setting env.host_string each time (this is very useful functionality), as in the following example:


#!/usr/bin/env python
"""Example code to use Fabric as a library. 
It shows how to set up host manually.
 
Author: Wawrzek Niewodniczanski < main at wawrzek dot name >
"""
 
# import sys to deal with scripts arguments and of course fabric 
import sys
import fabric
from fabric.api import run, hide, env

env.hosts = ['host1', 'host2'] 
 
# Main function to run remote task 
def run_task(task='uname'):
    """run_task([task]) -
    runs a command on a remote server. If task is not specified it will run 'uname'."""
    # hide some information (this is not necessary).
    with hide('running', 'status'):
        run(task) 
 
# Main loop
# take all arguments and run them on all hosts specified in env.hosts variable
# if there are no arguments run 'uname' 
if len(sys.argv) > 1:
    tasks = sys.argv[1:]
    for task in tasks:
        for host in env.hosts:
            env.host_string = host
            run_task(task)
else:
    for host in env.hosts:
        # set the host here as well, otherwise run() does not know where to connect
        env.host_string = host
        run_task() 



Fabric in full control

The problem bugged me ever since. Yesterday I found some of my old code, analysed it, and quickly found a small but profound difference from my recent Fabric usage. The code above called the run_task function wrongly: rather than calling it like a normal function, I was supposed to use execute.

#!/usr/bin/env python
"""Example code to use Fabric as a library. 
It shows how to run a task on all hosts with execute.
 
Author: Wawrzek Niewodniczanski < main at wawrzek dot name >
"""
 
# import sys to deal with scripts arguments and of course fabric 
import sys
import fabric
from fabric.api import run, hide, env, execute

env.hosts = ['host1', 'host2'] 
 
# Main function to run remote task 
def run_task(task='uname'):
    """run_task([task]) -
    runs a command on a remote server. If task is not specified it will run 'uname'."""
    # hide some information (this is not necessary).
    with hide('running', 'status'):
        run(task) 
 
# Main loop
# take all arguments and run them on all hosts specified in env.hosts variable
# if there are no arguments run 'uname' 
if len(sys.argv) > 1:
    tasks = sys.argv[1:]
    for task in tasks:
        execute(run_task, task)
else:
    execute(run_task)


Links:

http://www.fabfile.org/
http://larryn.blogspot.co.uk/2012/11/fabric-as-python-module.html
http://lists.nongnu.org/archive/html/fab-user/2014-10/msg00002.html

Thursday, August 21, 2014

(w)dstat

wdstat

In my .profile (on CentOS 5, in case dstat has changed since) I have the following alias for dstat (wdstat stands for Wawrzek's dstat):

alias wdstat="dstat -lcpymsgdn 5"

Where the options stand for:
  • -l - UNIX load (1m 5m 15m): load average over the last 1, 5 and 15 minutes, respectively;
  • -c - cpu stats (usr sys idl wai hiq siq): percentage of time spent in user and system space, idle, waiting for I/O, and serving hardware interrupts and softirqs (software interrupts);
  • -p - process stats (run blk new): number of running, blocked and newly created processes;
  • -y - system stats (int csw): number of interrupts and context switches;
  • -m - memory stats (used buff cach free): amount of memory used by processes, disk buffers, disk cache and free;
  • -s - swap stats (used free): amount of used and free swap space;
  • -g - page stats (in out): number of pages swapped in and out;
  • -d - disk stats (read writ): amount of data read from and written to all disks;
  • -n - network stats (recv send): amount of data received and sent over the network.

Further reading: