A naive question regarding Linux system administration, but, still, I HAVE to ask

By Oliver (AKA the Admin) on March 26, 2014 24 comments

in Categories: Just Talking

Hello guys !

(EDIT : I now realize my post was too poorly written, I apologize for this. I do not care about optimizing performance and tuning the system. I do not seek a craftsman’s approach. I really am focused on a bottleneck scanning tool that leaves the admin free to do what he wants with the output. Please see the replies I made to the comments at the bottom of this post…)

The rest of this post is about managing a linux server’s resources, so if that doesn’t arouse you (how could it not ?!?), or if that doesn’t interest you, you should just skip this post ^^

Since I gave up on shared hosting and moved on to dedicated hosting, I have learned a lot, both in terms of security and in terms of server management.
And there’s something that has been troubling me ever since.

When something works too slowly, it’s possible that the problem doesn’t lie in the server’s hardware (CPU, RAM, disks) but in the server’s configuration. Those evil bottlenecks.
For example, how the limits for MySQL are configured, in terms of open processes, or cache, or number of simultaneously open files… Or for PHP’s handler… Or for the memory limits… Or how Apache’s limits are set, between too generous and too restrictive…

This calls for testing and fixing.
At least, with MySQL (possibly the most common bottleneck), there are various tools like MySQLTuner.
However, what if we don’t want to make things work better, but simply want to know whether something is already at its maximum, and who cares, for the moment, for what reason ?

See… why isn’t there any system-wide checker for bottlenecks ?

A program that would, most simply,
1) COMPARE
. the current state of system values
(read from other monitoring commands)
. versus the upper limit on the system for these values
(mentioned in config files or defined by software defaults)
2) and PRODUCE A REPORT

Once that report is done, it’s all in the admin’s hands.
With that report in hand, he may use it to optimize performance, or use it to fix a problem, or use it to find where a hard to tackle problem finds its origin. I mean : really, who cares, the main point is to produce this report, there’s an infinity of usage scenarios for such a tool.

A program like that would tell us, for instance, let’s imagine…
. “you’re within 99% of the maximum value of 1024 for open-files-limit as defined in /etc/mysql/my.cnf”
. “you’re within 98% of your system’s default value of 250 for MaxClients , which may be adjusted in /etc/apache2/apache2.conf”
. “in the 5 minutes the monitoring lasted you reached 41 times the max value of 1024 for mysql soft nofile, which is a default value but may be adjusted in /etc/security/limits.conf…” (errr, or adjusted in /etc/mysql/my.cnf, I have doubts now ^^;;)
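
To make the idea concrete, here is a minimal Python sketch of the compare-and-report step. Everything in it is made up for illustration: the `Reading` structure, the `report` function, the 90% threshold and the sample numbers only mirror the imaginary messages above, and nothing is read from a real server.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """One observed value next to its configured ceiling (hypothetical structure)."""
    name: str      # setting name, e.g. "open-files-limit"
    current: int   # value observed right now
    limit: int     # maximum allowed value
    source: str    # where the maximum is defined


def report(readings, threshold=0.90):
    """Return one line per setting whose usage is at or above `threshold`."""
    lines = []
    for r in readings:
        if r.limit <= 0:
            continue  # skip unlimited or unknown ceilings
        usage = r.current / r.limit
        if usage >= threshold:
            lines.append(
                f"{r.name}: {r.current}/{r.limit} "
                f"({usage:.0%} of the maximum, set in {r.source})"
            )
    return lines


# Invented sample values, echoing the imaginary messages above.
sample = [
    Reading("open-files-limit", 1014, 1024, "/etc/mysql/my.cnf"),
    Reading("MaxClients", 245, 250, "/etc/apache2/apache2.conf"),
    Reading("max_connections", 12, 151, "/etc/mysql/my.cnf"),
]

for line in report(sample):
    print(line)
```

The sketch only reports; deciding whether a value near its ceiling is actually a problem is left to the admin.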

That would NOT be a tuner. Just a report.
Once again : let us let the admin deal with the results. It’s his business why he uses it and what he does with the results.

This would only be a program that reads the system-wide maximum values for the system settings (either the defaults or, when specified, the values set in configuration files), monitors the system for a few minutes, and reports which of these values are hitting or stuck at their maximum. Reporting the bottlenecks.
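
The monitoring part could look roughly like the Python sketch below. It is only an assumption about how such a tool might sample a value: `count_hits` and the `sampler` callable are hypothetical names, and the readings are an invented series, not real measurements.

```python
import time


def count_hits(sampler, limit, samples, interval=0.0):
    """Call `sampler()` `samples` times and summarise how often it reached `limit`.

    `sampler` is any zero-argument callable returning the current value.
    """
    hits = 0
    peak = 0
    for _ in range(samples):
        value = sampler()
        peak = max(peak, value)
        if value >= limit:
            hits += 1
        if interval:
            time.sleep(interval)  # wait between two samples
    return {"hits": hits, "peak": peak, "samples": samples}


# Stand-in for a real probe: replays a fixed, invented series of readings.
fake_readings = iter([800, 1024, 990, 1024, 1024, 600])
result = count_hits(lambda: next(fake_readings), limit=1024, samples=6)
print(f"reached the max {result['hits']} times in {result['samples']} samples "
      f"(peak {result['peak']})")
```

In a real run, `sampler` would wrap whatever monitoring command or status query exposes the value, and `interval` would be a few seconds.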

Sure, there are tons of elements to scan, so maybe it could be limited to a typical webserver’s profile, Apache/MySQL/PHP. Still a ton of variables, but now a finite and determined number.

As far as I am concerned, this would be SUPER DUPER useful.

I could give you a personal example…
I don’t know if you noticed it, yesterday Hentairules was offline for two hours ?
That was because I had a bug (something broke, after which, chaaarge!, domino effect !) and I didn’t know which system value(s) had to be adjusted to allow the system to “calm down”, basically. The MySQL engine remained fucked up even after a reboot (my most joyful discovery of the year in server administration : sometimes a reboot doesn’t fix shit at all even if nothing altered the system’s settings, yay ! I’m so glad to know this may not be enough !), and to fix it, MySQL needed to be allowed to deal with more stuff than before in order to repair itself… except that I didn’t know WHAT, in blazing hell, needed to be changed, I didn’t even understand it was a MySQL setting that required changing. It took a lot of time to find what had to be updated, I spent hours parsing error logs, googling stuff, and running dozens and dozens and dozens of command lines.
And yet, if I had had a bottleneck-checker utility, it would have taken only a minute and spared me many hours: I would have had my report that this and that were desperately stuck at the maximum, and it would have massively eased the whole system fixing.

Do you see what I mean ?

And yet, a program like that doesn’t exist.

There must be hundreds of good reasons for this not to exist. I wish I could name one.

What are your thoughts about it ?
Does such a program actually exist ? (If you say “yes” and provide me a link, I may have an orgasm, fair warning)
Is it impossible because of the differences between distros ? (To this, I would object there are only a few differences between major distros regarding where the system settings are stored, and at worst we could document it in the script’s config)
Any other thoughts ?

(Edit : you’re reading a second version of my post, I made lots of changes. The first version looked focused on optimizing and monitoring, while this is NOT my purpose at all. I am really focused on bottleneck searching and reporting. Keep that in mind while reading the comments ^^)

24 Comments

Foo Bar

10 years ago

http://en.wikipedia.org/wiki/Nagios

Oliver AKA The Admin

10 years ago

Reply to Foo Bar

That would be for monitoring performance and hardware-to-software interactions, I feel, right ? Still, thanks for the link, I now realize I should have written my question better

Flaming Cheezburger

10 years ago

It’s not exactly what you’re looking for, but have you tried the ‘top’ command from the command line? It’ll give you a breakdown of which running processes are using up the most system resources (cpu, ram, etc.).

Likewise, have you tried tailing your logs (tail -f /var/log/log_name)? You might be able to find some useful clues as to what mysql is doing to screw up so badly.

Oliver AKA The Admin

10 years ago

Reply to Flaming Cheezburger

top, iotop, mytop, iostat -x 5, vmstat 5, mpstat -P ALL 1, tail -f /var/log/mysql/mysql-slow.log, wc -l /proc/net/ip_conntrack, netstat -nat | awk '{print $6}' | sort | uniq -c | sort -n, perl mysqltuner.pl, ./tuning-primer.sh ?

Yeah, I've vaguely heard of commands like that, I may even have run them by accident, maybe I was drunk

gammon

10 years ago

Have a look at http://brendangregg.com/linuxperf.html

The infographic in the upper right-hand corner alone should give a nice hint why there is no simple answer to your simple question. It’s because it’s not a simple question but a complex one =)

Oliver AKA The Admin

10 years ago

Reply to gammon

Oh god, that was a depressing infographic.

I see the idea, yeah. I imagine it would be safer to focus on a LAMP perspective and only monitor the variables associated to the web server activity…

LonesomeAdmin

10 years ago

As already pointed out above – there are no simple answers to such a question.

When it comes down to servers, everything boils down to “usage scenario”. You can’t cover a “configuration” of your server through a simple “auto-conf” program – there are simply too many factors involved.

The configuration of the services you’re running on your server entirely depends on the particular usage scenario as well as your expectations (i.e. Apache/PHP/MySQL are more mission critical than a, for example, FTP you’re running in the background for easy file-upload). As with everything in Linux you, of course, are expected to do your homework and research on the topics so you can make a well educated decision on the configuration of your machine(s). That continuous “RTFM” is a royal pain in the ass, but it’s for sure the only way by which you really understand what’s going on to draft up the configuration and then give it a try in some live-environment.

There are “benchmarks” which help you in finding out your “boundaries” … like … how many client requests can Apache serve on my hardware and related config before it starts to crap out … or … how many queries per second does MySQL (bad choice by the way, I would kick MySQL into the depths of hell and use MariaDB instead) manage to swallow before the engine starts to throw up … or … keep an eye on your memory stats to see how much RAM PHP sucks up while parsing your code (and also, how much time a SQL query takes till the data comes back from the database).

This is where you can lose yourself in fine-tuning to no ends … but … no matter how great your configuration is, the most likely bottleneck is the hardware.

Rule of thumb: the more RAM, the faster the I/O and hard drives, and the faster the CPU, the better the performance you can squeeze out of the system.

If you’re running on a tight RAM budget but you have a lot of concurrent page-views, well, there’s nothing you can really do other than up the RAM.

If you’re running everything off a lousy PATA/SATA hard drive there’s also not much you can really do if the I/O throughput from/to the hard drive is the bottleneck in high load situations – other than switch to faster drives and/or interfaces.

If you’re running a RAID-1 (mirroring your system and “www” directory to a second drive in case one fails) or RAID-5 (to keep your data secured) in software on a CPU that’s busier managing the RAID than running the services, you can only switch to a faster CPU or leave RAIDing to a dedicated controller which can do it in hardware on its own.

There are so many variables involved … you can’t really “optimize” crap through a “simple” script – it is a matter of “config has to fit the particular usage scenario”. If you think that all the “big irons” out there ship with a magician doing the “optimizing” work, you’re wrong. The big irons are all hand-tuned to do their assigned task as efficiently as somehow possible … there is no “1337 mah ServA” one-click tool for Crash Test Idiots.

On the level of servers mere mortals are using… the best you can possibly do, if you have zilch ideas about what you’re doing, is to try and see if you find sane numbers from another system matching yours and give it a trial-and-error whirl to see what problems may arise – of course you’re expected to have done your homework first so you understand the topic at hand and what the numbers want to tell you.

Linux, and other UNIX-ish Operating Systems, can be a pain in the butt at creating a sane configuration and tracking down a bottleneck that has just shown its ugly face in the depths of some log, but being able to stick your fingers into the darkest corners of the system is what is literally priceless… try to slap a Windows server into shape where you can do, compared to the openness of Linux, pretty much nothing to tweak the software into shape.

The best “server optimizer” you can possibly get is either yourself (in case you are willing to educate yourself in the matter) or some helpful friend/acquaintance knowing his way around in the depths of a Linux system.

Linux isn’t for free… you can get it for free, but the price you have to pay is the time you spend in understanding the system and how it’s being configured to work at your expectations within what’s possible on the platform it’s running on.

Oliver AKA The Admin

10 years ago

Reply to LonesomeAdmin

Hey Lonesome

Your long reply, as well as the other replies, cruelly showed me I didn't write my post properly. If I had, there would have been far fewer misunderstandings >_<

To rephrase it, I do not care about optimizing performance.
Really not.
I care about finding the software bottlenecks.

Let us picture ourselves in a situation where hardware is plentiful and there isn't enough demand to fully use the hardware resources of the server.
There will be, even then, situations where the server fails to give all it can give, because software settings are restricting it.

Bottlenecks.

Too strong restrictions on server variables.

That may be bad settings. But that may also be normal, legitimate settings for normal situations, that become too low to face a "damn fucking bug why oh god why" situation

See the idea ?

No fine tuning.
No artisan craft knowledge gained through the years.
Not improving. Just finding where.

Simple tracking of actual bottlenecks.

It will be up to the admin to determine if these are normal bottlenecks. Or if those bottlenecks come from heavy duty because of the visitors. Or if those bottlenecks come from yet another problem.

But, at least, a start : a tool that will scan
/etc/mysql/my.cnf
and that will be fed in advance the list of MySQL's default system variables, for the variables not specified in the file above
/etc/apache2/*.conf (or httpd, or wherever the major distros put the Apache folder)
and that will be fed in advance the list of Apache's default system variables, for the variables not specified in the files above
…
etcetera, etcetera.
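
As a rough illustration of that "config file plus pre-fed defaults" step, here is a small Python sketch. It is a guess at one possible shape, not an existing tool: `effective_settings` is a hypothetical helper, the defaults table holds placeholder values (real defaults depend on the MySQL version), the config text is an invented sample, and `!includedir` directives found in real my.cnf files are not followed.

```python
import configparser

# Placeholder defaults, NOT authoritative: real defaults depend on the MySQL version.
ASSUMED_DEFAULTS = {"max_connections": "151", "open_files_limit": "1024"}

# Invented sample standing in for the content of /etc/mysql/my.cnf.
SAMPLE_CNF = """
[mysqld]
max_connections = 300
skip-name-resolve
"""


def effective_settings(cnf_text, defaults, section="mysqld"):
    """Merge values found in `section` over the supplied defaults."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    settings = {name: (value, "default") for name, value in defaults.items()}
    if parser.has_section(section):
        for name, value in parser.items(section):
            # my.cnf accepts both dashes and underscores in option names.
            settings[name.replace("-", "_")] = (value, "config file")
    return settings


for name, (value, origin) in sorted(effective_settings(SAMPLE_CNF, ASSUMED_DEFAULTS).items()):
    print(f"{name} = {value} ({origin})")
```

Each setting keeps track of where its value came from, which is what the report would need in order to say "default value" or "as defined in my.cnf".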

I hope this gives a less confusing idea about what I'm dreaming of ? Would you still give the same answer, to this formulation I made ?

oldbrokenhands

10 years ago

According to this article there are commands that help you with the hardware side of things:
http://www.linux-mag.com/id/7473/

I was probing around on Stack Overflow's website, seems they also have a few SQL gurus there.

Also found this blog about SQL.
http://www.mysqlperformanceblog.com/2011/03/08/ho…

To be honest DBs are not my forte, but it sounds like a neat programming task to tackle. See a need fill a need.

Oliver AKA The Admin

10 years ago

Reply to oldbrokenhands

Thanks for the links, I learned stuff from them

I still think, though, they don't address the main concern, tracking bottlenecks from a list of system variables, and they're more optimization-oriented.
The second link could have interested me, if I weren't able to crush my server's CPU with simple select join requests already, those from my hentairules galleries CMS ^^;; (Simple queries storing image metadata in database and linking image identification data to their hardware location.)
Because of these, I have the choice to either put the gallery in maintenance mode while I upload, or give up on placing the galleries in maintenance and impose on you 3-4 seconds loading page timeouts while I upload.

For more details on what I'm after, would you look at the answer I gave to LonesomeAdmin, just above ?

oldbrokenhands

10 years ago

Reply to Oliver AKA The Admin

Yea, sorry about that. All I could find is the term "profiling". There are little snippets of what you want here and there, but nothing comprehensive.

From what I'm gathering you want:
-A program that takes a CNF file
-Parses out data in that file using flags set by the server admin
-Then the data is displayed in a text table or a gui interface that looks like gauges on a power plant monitor. Gauges that can be turned on or off to simplify the view

All I could find were scripts, CLI programs, and various tips, I'll include them below for you to peruse

Sorry, I could not be of more help, happy hunting:
http://blog.jambura.com/2011/09/10/tuning-optimiz…
http://www.howtogeek.com/howto/linux/using-a-MySQ…
http://serverfault.com/questions/198747/how-to-de…
https://drupal.org/node/2601
http://serverfault.com/questions/8169/profiling-a…
http://www.sitepoint.com/the-need-for-speed-profi…

Xenor

10 years ago

I would think that something must exist. On windows, you have Resource Monitor / Performance Monitor that can at least tell you what is bottlenecked (CPU, RAM, Disk, Network). Performance Monitor lets you record over time so you can attempt to isolate problems (i.e….server locks up 10 minutes after boot, etc.).

I have a friend who works for Microsoft on SQL Server; he may have a few clues, as he sometimes interacts with SQL from Linux machines as well.

There is no quick fix button, as some have suggested, but I think the best starting place would be to identify which of your resources are being taxed and then isolate it.

Which sql server are you using?

Oliver AKA The Admin

10 years ago

Reply to Xenor

LAMP/debian/stable webmin+virtualmin solution.

But we're not into hardware, we're purely into software issues.

This problem could be monitored on a server that's sluggish because of too many visitors, as well as on a super-powerful server that doesn't use all of its resources and still manages to be slowed/stalled/paralyzed by software variable settings that prevent the system from working well….

For more details on what I'm after, would you look at the answer I gave to LonesomeAdmin, just above ?

mtc

10 years ago

Try to ask this question on techsnap http://www.jupiterbroadcasting.com/show/techsnap/

Oliver AKA The Admin

10 years ago

Reply to mtc

I don't get you, are you suggesting I ask on an internet forum, Mtc ?

Ion

10 years ago

On servers I use FreeBSD which is pretty easy to monitor, but I also use sysctl in both systems. Linux also has descriptors under /proc

I’d also be surprised if mysql wasn’t reporting problems in the logs though.
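
As an illustration of what /proc exposes on Linux, here is a minimal Python sketch that compares the number of allocated file handles with the system-wide maximum, both of which appear in /proc/sys/fs/file-nr. The helper names and the sample content are assumptions made for the example; this is not taken from any particular setup.

```python
def parse_file_nr(text):
    """Parse the three numbers found in /proc/sys/fs/file-nr.

    The fields are: allocated file handles, allocated-but-unused handles,
    and the system-wide maximum (also exposed as fs.file-max via sysctl).
    """
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def usage_ratio(stats):
    """Fraction of the system-wide file-handle limit currently allocated."""
    return stats["allocated"] / stats["max"]


# Invented sample content; on a real Linux box you would read the file instead:
#   with open("/proc/sys/fs/file-nr") as fh: text = fh.read()
stats = parse_file_nr("3392\t0\t804763\n")
print(f"file handles: {stats['allocated']}/{stats['max']} "
      f"({usage_ratio(stats):.2%} of fs.file-max)")
```

This covers only one kernel-level limit; per-service limits such as those of MySQL or Apache would still have to come from their own configuration and status output.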

Oliver AKA The Admin

10 years ago

Reply to Ion

Logs are for problems. Bottlenecks may not necessarily be seen by the system as a log-worthy problem. Or they may cause problems for other software elements, which will record it elsewhere…

That's also why I would need a strict, systematic scan of a "list of variables : present values VS the system's current allowed max values".

For more details on what I'm after, would you look at the answer I gave to LonesomeAdmin, just above ?

pl0p

10 years ago

The error that the other server returned was:
550 5.1.1 <olivier at hentairules.net>: Recipient address rejected: User unknown in virtual alias table
you have some problem

HurpDurp

10 years ago

Reply to pl0p

It doesn't help that you're STILL spelling Oliver wrong (just like when you were saying it on the IRC yesterday).

Although, I'm not entirely sure why Oliver doesn't have his email set up as a catchall, so that no matter what anyone put he'd get the email nonetheless.

Oliver AKA The Admin

10 years ago

Reply to HurpDurp

Easy enough, that's because I don't want to enlarge my penis, buy cialis or allow a handsome foreigner to send me gold.
No way in hell am I accepting catchall emails.

To contact me, it's oliver at, not olivier at

HurpDurp

10 years ago

Reply to Oliver AKA The Admin

>Implying you don't get those anyway

asd

10 years ago

Hey Oliver!

I’m a Linux system admin with an RHCE so I can give you some pointers.

Performance tuning is like a craft that is learned as you gain experience doing it, so there is no unified tool that manages all aspects of a LAMP (Linux+Apache+MySQL+PHP) server…

It is best to make small adjustments and allow some time to measure and compare the changes.

That said, for the OS side of things you can have an overall picture with nmon http://www.ibm.com/developerworks/aix/library/au-…

For bandwidth I like iptraf.

Both can be found in the OS repos.

Also to improve web server performance at virtually no cost try varnish. https://www.varnish-cache.org/

Oliver AKA The Admin

10 years ago

Reply to asd

Hey Asd

I'll address the biggest part of your comment first: for more details on what I'm after, would you look at the answer I gave to LonesomeAdmin, a few comments above ?

I really don't care in this post about monitoring and tuning. This, monitoring and tuning, is a craft, indeed, made of one small improvement after another.

But that's not my point. I can only blame myself for allowing confusion to settle in, I should have written my initial post better.

My point is : bottlenecks. Not this. Not that. Not what causes them. Not what they cause.
No, just them, bottlenecks.

Tracking them. Finding them. Producing a list.
And letting the admin do whatever he wants with it.

That may be on powerful servers. On weak servers burdened by too much traffic. On broken servers because of a mysterious bug. Whatever : this doesn't matter, I know there's an infinity of usage scenarios. But in all cases, if there's a bottleneck, a tool that will track and list them may help and save hours and hours of work

More about what I mean is in my comment reply to Lonesome, scroll up

Varnish doesn't really interest me as it is, my server's already finely set up. Not by me, I know my limits. By a professional, it's his job. And I don't want to add yet another layer of complexity that would potentially threaten the stability of the system while it reduces my ability to maintain it all in proper working order. Not blaming the software, blaming my not competent enough self, if the precision is even necessary. (But, heck, I'm learning and improving, at least.)

Nmon : I'll give it a look, thank you, very much !
Is it like a command-line and text-based version of what Munin produces, as I already am using Munin to monitor my server ?
Or a compilation of the output of the resource monitors ? I guess I'll see

avatar

10 years ago

You shouldn’t dismiss the output of logs. If you are breaching configured thresholds, logs would be where you go. Sometimes they tell you exactly what it is and what you need to do, most times you have to Google the warnings/errors to identify the offending config ‘bottleneck’.

In your example, if you’re 98% of a maximum limit you might actually be running a very good configuration and the system won’t show any problems.
