Puppet interview questions

Top 15 Puppet interview questions

Adding a yum repo to puppet before doing anything else

Is there a way to force puppet to do certain things first? For instance, I need it to install an RPM on all servers to add a yum repository (IUS Community) before I install any of the packages.
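One common pattern, sketched below on the assumption that the IUS repository ships as a release RPM (the package name and URL here are illustrative, not the real ones): install the repo package first, then make every other Package resource depend on it using a chaining arrow and a resource collector.

```puppet
# Install the repository RPM before anything else.
package { 'ius-release':
  ensure   => installed,
  provider => 'rpm',
  source   => 'http://example.com/ius-release.rpm',  # illustrative URL
}

# Every other package waits for the repo package to be in place.
Package['ius-release'] -> Package <| title != 'ius-release' |>
```

The collector on the right-hand side matches all Package resources except the repo package itself, so no individual `require` lines are needed.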

Source: (StackOverflow)

Do chef and puppet cost money?

I intend to use Chef or Puppet for administration (I'm leaning more towards Chef as it's younger and I get a better feeling about it).

On both home pages I saw there is an "enterprise edition" that costs money, and I don't intend to buy anything. What would I miss in Chef / Puppet if I don't buy them?

What does chef offer that costs money exactly?
What does puppet offer that costs money exactly?

It was not clear to me from their websites, as the pricing is kind of obscure.

Source: (StackOverflow)

What advantages/features does Puppet or Chef offer over Salt (or vice versa)? [closed]

I am looking at rolling out a new configuration management tool to replace our home-grown solution. The de facto standards are Chef and Puppet, both of which are Ruby-centric (though they can obviously be used to deploy non-Ruby environments). The vast majority of our development is done in Python, and our in-house deployment tools make heavy use of Fabric. Therefore I am leaning towards Salt since it too is Python, even though it is not as mature as Chef or Puppet. But since I'm not familiar enough with the options, I'm finding it difficult to compare apples to apples.

Other than the smaller community, would I be giving up anything significant by using Salt rather than Puppet/Chef?


It's been six months since I posted this question. And despite it being closed, it's been viewed over 1,000 times so I thought I'd comment on my experiences.

I eventually decided on Puppet since it had a bigger community. However, it was an immensely frustrating experience, mainly due to the convoluted Puppet configuration syntax. Since I now had a frame of reference to compare the two, I recently took another look at Salt--I'm not going back. It is very, very cool. The things I like best:

  • Seamless integration of both push and pull configuration models. Puppet uses a pull model (nodes periodically poll the server for updates) and has a sister component called MCollective (the Marionette Collective) for pushing changes. Both are important to me and I prefer how Salt works. Salt also executes much faster when you have a lot of nodes.

  • Configuration syntax uses YAML, which is just a simple text format that uses indentation and bullet points. You can also choose to use other configuration formats via templates. This makes Salt about 10x easier to learn and maintain, in my experience.

  • Python-based. This was the biggest reason I started looking at Salt in the first place. It ended up being one of the more minor reasons I stayed. But if you're a Python shop like us, it makes it easier to develop Salt plugins.

Source: (StackOverflow)

automate dpkg-reconfigure tzdata

I'm using Puppet to admin a cluster of Debian servers. I need to change the timezone of each machine on the cluster. The proper Debian way to do this is to use dpkg-reconfigure tzdata, but I can only seem to change it via the dialog. Is there some way to automate this from the shell so I can just write an Exec to make this easy?

If not, I think the next best way would probably be to have puppet distribute /etc/timezone and /etc/localtime with the correct data across the cluster.

Any input appreciated!
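For what it's worth, dpkg-reconfigure has a noninteractive frontend, so the two ideas above can be combined; a minimal sketch (the zone value is just an example):

```puppet
# Keep /etc/timezone under Puppet control and re-run tzdata's
# configuration non-interactively whenever the file changes.
file { '/etc/timezone':
  ensure  => file,
  content => "Europe/Amsterdam\n",  # example zone
}

exec { 'update-tzdata':
  command     => '/usr/sbin/dpkg-reconfigure -f noninteractive tzdata',
  refreshonly => true,
  subscribe   => File['/etc/timezone'],
}
```

With refreshonly, the Exec fires only when Puppet actually changes /etc/timezone, so nothing runs on every agent pass.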

Source: (StackOverflow)

Puppet vs Chef, pro and contra from users and use cases [closed]

I already googled and read the "to-puppet-or-to-chef-that-is-the-question" article.

I'm interested in use cases, real-world implementations in which people chose one or the other based on real problems.

I'm particularly interested in integration with Cobbler (I know Puppet is much more the standard approach in this direction); does anybody have experience with Cobbler-Chef integration?

Thanks in advance

Source: (StackOverflow)

What should NOT be managed by puppet?

I'm learning my way through configuration management in general and using puppet to implement it in particular, and I'm wondering what aspects of a system, if any, should not be managed with puppet?

As an example, we usually take for granted that hostnames are already set up before handing the system over to Puppet's management. Basic IP connectivity, at least on the network used to reach the puppetmaster, has to be working. Using Puppet to automatically create DNS zone files is tempting, but DNS reverse pointers ought to be in place before starting the whole thing up, or certificates are going to be funny.

So should I leave IP configuration out of Puppet? Or should I set it up prior to starting Puppet for the first time but manage IP addresses with Puppet nonetheless? What about systems with multiple IPs (e.g. for WAN, LAN and SAN)?

What about IPMI? You can configure most, if not all, of it with ipmitool, saving you from getting console access (physical, serial-over-lan, remote KVM, whatever) so it could be automated with puppet. But re-checking its state at every puppet agent run doesn't sound cool to me, and basic lights out access to the system is something I'd like to have before doing anything else.

Another whole story is about installing updates. I'm not going into this specific point; there are already many questions on SF and many different philosophies among sysadmins. Myself, I decided not to let Puppet update things (e.g. only ensure => installed) and to do updates manually as we are already used to, leaving the automation of this task to a later day when we are more confident with Puppet (e.g. by adding MCollective to the mix).
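For illustration, the ensure => installed policy mentioned above looks like this in a manifest (the package name is arbitrary); switching to ensure => latest is what would hand upgrades over to Puppet:

```puppet
# Only guarantee presence; never upgrade the package automatically.
package { 'openssh-server':
  ensure => installed,
}
```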

Those were just a couple of examples I have in mind right now. Is there any aspect of the system that should be left out of Puppet's reach? Or, put another way, where is the line between what should be set up at provisioning time and "statically" configured in the system, and what is handled through centralized configuration management?

Source: (StackOverflow)

Why is it so difficult to upgrade between major versions of Red Hat and CentOS?

"Can we upgrade our existing production EL5 servers to EL6?"

A simple-sounding request from two customers with completely different environments prompted my usual best-practices answer of "yes, but it will require a coordinated rebuild of all of your systems"...

Both clients feel that a complete rebuild of their systems is an unacceptable option for downtime and resource reasons... When asked why it was necessary to fully reinstall the systems, I didn't have a good answer beyond, "that's the way it is..."

I'm not trying to elicit responses about configuration management ("Puppetize everything" doesn't always apply) or how the clients should have planned better. This is a real-world example of environments that have grown and thrived in a production capacity, but don't see a clean path to move to the next version of their OS.

Environment A:
Non-profit organization with 40 x Red Hat Enterprise Linux 5.4 and 5.5 web, database and mail servers, running a Java web application stack, software load balancers and Postgres databases. All systems are virtualized on two VMware vSphere clusters in different locations, each with HA, DRS, etc.

Environment B:
High-frequency financial trading firm with 200 x CentOS 5.x systems in multiple co-location facilities running production trading operations, supporting in-house development and back-office functions. The trading servers are running on bare-metal commodity server hardware. They have numerous sysctl.conf, rtctl, interrupt binding and driver tweaks in place to lower messaging latency. Some have custom and/or realtime kernels. The developer workstations are also running a similar version(s) of CentOS.

In both cases, the environments are running well as-is. The desire to upgrade comes from a need for a newer application or feature available in EL6.

  • For the non-profit organization, it's tied to Apache, the kernel and some things that will make the developers happy.
  • In the trading firm, it's about some enhancements in the kernel, networking stack and GLIBC, which will make the developers happy.

Both are things that can't be easily packaged or updated without drastically altering the operating system.

As a systems engineer, I appreciate that Red Hat recommends full rebuilds when moving between major version releases. A clean start forces you to refactor and pay attention to configs along the way.

Being sensitive to business needs of clients, I wonder why this needs to be such an onerous task. The RPM packaging system is more than capable of handling in-place upgrades, but it's the little details that get you: /boot requiring more space, new default filesystems, RPM possibly breaking mid-upgrade, deprecated and defunct packages...

What's the answer here? Other distributions (.deb-based, Arch and Gentoo) seem to have this ability or a better path. Let's say we find the downtime to accomplish this task the right way:

  • What should these clients do to avoid the same problem when EL7 is released and stabilizes?
  • Or is this a case where people need to resign themselves to full rebuilds every few years?
  • This seems to have gotten worse as Enterprise Linux has evolved... Or am I just imagining that?
  • Has this dissuaded anyone from using Red Hat and derivative operating systems?

I suppose there's the configuration management angle, but most Puppet installations I see do not translate well into environments with highly-customized application servers (Environment B could have a single server whose ifconfig output looks like this). I'd be interested in hearing suggestions on how configuration management can be used to help organizations get across the RHEL major-version bump, though.

Source: (StackOverflow)

Why use Chef/Puppet over shell scripts?

I'm new to the Puppet and Chef tools. It seems like the job they do can be done with shell scripting. Maybe it was done in shell scripts until these came along.

I would agree they are more readable. But, are there any other advantages over shell scripts besides just being readable?

Source: (StackOverflow)

How can the little guys effectively learn and use Puppet?

Six months ago, in our not-for-profit project we decided to start migrating our system management to a Puppet-controlled environment because we are expecting our number of servers to grow substantially between now and a year from now.

Since the decision has been made our IT guys have become a bit too annoyed a bit too often. Their biggest objections are:

  • "We're not programmers, we're sysadmins";
  • Modules are available online, but many differ from one another; wheels are being reinvented too often. How do you decide which one fits the bill?
  • Code in our repo is not transparent enough; to find out how something works they have to recurse through manifests and modules they might even have written themselves a while ago;
  • One new daemon requires writing a new module, and conventions have to be similar to those of other modules, which is a difficult process;
  • "Let's just run it and see how it works"
  • Tons of barely known 'extensions' in community modules: 'trocla', 'augeas', 'hiera'... How can our sysadmins keep track?

I can see why a large organisation would dispatch their sysadmins to Puppet courses to become Puppet masters. But how would smaller players get to learn Puppet to a professional level if they do not go to courses and basically learn it via their browser and editor?

Source: (StackOverflow)

How can I pre-sign puppet certificates?

Puppet requires certificates between the client (puppet agent) being managed and the server (puppetmaster). You can run the agent manually on the client and then go onto the server to sign the certificate, but how do you automate this process for clusters / cloud machines?
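One option is the master's autosign whitelist; a sketch, assuming agent certnames share a common domain (the domain below is hypothetical). Note that basic autosigning trusts any host that can reach the master, so it should be restricted to trusted networks:

```
# /etc/puppet/autosign.conf on the puppetmaster
*.internal.example.com
```

Any certificate request whose certname matches an entry in this file is signed automatically when the agent first connects.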

Source: (StackOverflow)

Options for Multisite High Availability with Puppet

I maintain two datacenters, and as more of our important infrastructure starts to get controlled via Puppet, it is important that the Puppet master at the second site keeps working should our primary site fail.

Even better would be to have a sort of active / active setup so the servers at the second site are not polling over the WAN.

Are there any standard methods of multi-site puppet high availability?

Source: (StackOverflow)

Fixing services that have been disabled in /etc/default/ with puppet?

I'm using Puppet to (theoretically) get npcd to start upon installation; however, on Ubuntu that service comes installed with the default setting of RUN="no" in /etc/default/npcd:

 $ cat /etc/default/npcd
 # Default settings for the NPCD init script.

 # Should NPCD be started? ("yes" to enable)
 RUN="no"

 # Additional options that are passed to the daemon.
 DAEMON_OPTS="-d -f /etc/pnp4nagios/npcd.cfg"

I would think that this block of puppet config would take care of things:

    service { "npcd":
       enable   => true,
       ensure   => "running",
       require  => Package["pnp4nagios"],

But alas, it doesn't, and short of actually rewriting the file in /etc/default, I'm not sure what to do. Is there a straightforward way to enable the service that I'm not seeing?
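One workaround, assuming the puppetlabs-stdlib module is available for its file_line resource type, is to manage just the RUN line rather than the whole file:

```puppet
# Flip RUN="no" to RUN="yes" in /etc/default/npcd, then restart npcd.
file_line { 'enable-npcd':
  path    => '/etc/default/npcd',
  match   => '^RUN=',
  line    => 'RUN="yes"',
  require => Package['pnp4nagios'],
  notify  => Service['npcd'],
}
```

The match parameter replaces the existing RUN= line in place instead of appending a duplicate at the end of the file.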

For the record, I'm using Ubuntu 12.04.2 and puppet version 3.1.0.

Source: (StackOverflow)

Adding lines to /etc/profile with puppet?

I use puppet to install a current JDK and tomcat.

package {
    [ "openjdk-6-jdk", "openjdk-6-doc", "openjdk-6-jre",
      "tomcat6", "tomcat6-admin", "tomcat6-common", "tomcat6-docs",
      "tomcat6-user" ]:
    ensure => present,
}

Now I'd like to add

export JAVA_HOME

to /etc/profile, just to get this out of the way. I haven't found a straightforward answer in the docs, yet. Is there a recommended way to do this?

In general, how do I tell puppet to place this file there or modify that file? I'm using puppet for a single node (in standalone mode) just to try it out and to keep a log of the server setup.
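One conventional approach, sketched here with an assumed JDK path, is to leave /etc/profile alone and drop a snippet into /etc/profile.d/, which /etc/profile sources for login shells:

```puppet
# Provide JAVA_HOME via a profile.d snippet instead of editing
# /etc/profile in place.
file { '/etc/profile.d/java.sh':
  ensure  => file,
  mode    => '0644',
  content => "export JAVA_HOME=/usr/lib/jvm/java-6-openjdk\n",  # path is an assumption
}
```

Managing a whole small file this way is easier to reason about than editing lines inside a file Puppet doesn't otherwise own.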

Source: (StackOverflow)

Managing an application across multiple servers, or PXE vs cfEngine/Chef/Puppet

We have an application that is running on a few (5 or so, and this will grow) boxes. The hardware is identical in all the machines, and ideally the software would be as well. I have been managing them by hand until now, and don't want to anymore (static IP addresses, disabling all unnecessary services, installing required packages...). Can anyone weigh the pros and cons of the following options, or suggest something more intelligent?

1: Individually install CentOS on all the boxes and manage the configs with Chef/cfengine/Puppet. This would be good, as I have wanted an excuse to learn to use one of these applications, but I don't know if this is actually the best solution.

2: Make one box perfect and image it. Serve the image over PXE, and whenever I want to make modifications I can just reboot the boxes from a new image. How do cluster guys normally handle things like having MAC addresses in the /etc/sysconfig/network-scripts/ifcfg* files? We use InfiniBand as well, and it also refuses to start if the hwaddr is wrong. Can these be correctly generated at boot?

I'm leaning towards the PXE solution, but I think monitoring with munin or nagios will be a little more complicated with this. Anyone have experience with this type of problem?

All the servers have SSDs in them and are fast and powerful.

Thanks, matt.

Source: (StackOverflow)

Configuration management: push versus pull based topology

The more established configuration management (CM) systems like Puppet and Chef use a pull-based approach: clients poll a centralized master periodically for updates. Some of them offer a masterless approach as well (so, push-based), but state that it is 'not for production' (Saltstack) or 'less scalable' (Puppet). The only system that I know of that is push-based from the start is runner-up Ansible.

What is the specific scalability advantage of a pull based system? Why is it supposedly easier to add more pull-masters than push-agents?

For example, agiletesting.blogspot.nl writes:

in a 'pull' system, clients contact the server independently of each other, so the system as a whole is more scalable than a 'push' system

On the other hand, Rackspace demonstrates that they can handle 15K systems with a push-based model.

infrastructures.org writes:

We swear by a pull methodology for maintaining infrastructures, using a tool like SUP, CVSup, an rsync server, or cfengine. Rather than push changes out to clients, each individual client machine needs to be responsible for polling the gold server at boot, and periodically afterwards, to maintain its own rev level. Before adopting this viewpoint, we developed extensive push-based scripts based on ssh, rsh, rcp, and rdist. The problem we found with the r-commands (or ssh) was this: When you run an r-command based script to push a change out to your target machines, odds are that if you have more than 30 target hosts one of them will be down at any given time. Maintaining the list of commissioned machines becomes a nightmare. In the course of writing code to correct for this, you will end up with elaborate wrapper code to deal with: timeouts from dead hosts; logging and retrying dead hosts; forking and running parallel jobs to try to hit many hosts in a reasonable amount of time; and finally detecting and preventing the case of using up all available TCP sockets on the source machine with all of the outbound rsh sessions. Then you still have the problem of getting whatever you just did into the install images for all new hosts to be installed in the future, as well as repeating it for any hosts that die and have to be rebuilt tomorrow. After the trouble we went through to implement r-command based replication, we found it's just not worth it. We don't plan on managing an infrastructure with r-commands again, or with any other push mechanism for that matter. They don't scale as well as pull-based methods.

Isn't that an implementation problem instead of an architectural one? Why is it harder to write a threaded push client than a threaded pull server?

Source: (StackOverflow)