obs_admin
obs_admin is a command-line tool used on the back-end server(s) to manage running services, submit maintenance tasks, and debug problems. It should only be used by experienced admins.
It has built-in help which you can display with obs_admin --help.
Options to control the running services:
Job Controlling
===============

--shutdown-scheduler <architecture>
    Stops the scheduler nicely, dumping out its current state for fast startup.

--check-project <project> <architecture>
--check-project <project> <repository> <architecture>
--check-all-projects <architecture>
    Check the status of a project and its repositories again.

--deep-check-project <project> <architecture>
--deep-check-project <project> <repository> <architecture>
    Check the status of a project and its repositories again.
    This deep check also includes the sources, in case of lost events.

--check-package <project> <package> <architecture>
    Check the status of a package in all repositories.

--publish-repository <project> <repository>
    Creates an event for the publisher. The scheduler is NOT scanning for
    new packages. The publisher may skip the event if nothing has changed.
    Use --republish-repository when you want to enforce a publish.

--unpublish-repository <project> <repository>
    Removes the prepared :repo collection and lets the publisher remove the
    result. This also updates the search database.
    WARNING: this works also for locked projects!

--prefer-publish-event <name>
    Prefers a publish event to be next. <name> is the file name inside of
    the publish event directory.

--republish-repository <project> <repository>
    Enforces publishing of a repository.

--rebuild-full-tree <project> <repository> <arch>
    Rebuilds the content of the :full/ directory.

--clone-repository <source project> <source repository> <destination repository>
--clone-repository <source project> <source repository> <destination project> <destination repository>
    Clone an existing repo into another existing repository.
    Useful for creating snapshots.

--rescan-repository <project> <repository> <architecture>
    Asks the scheduler to scan a repository for new packages and add them
    to the cache file.

--force-check-project <project> <repository> <architecture>
    Enforces the check of a repository, even when it is currently blocked
    due to the amount of calculation time.
--create-patchinfo-from-updateinfo
    Creates a patchinfo submission based on updateinfo information.
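For instance, enforcing a publish and then re-checking the project state could look like this (a sketch; the project, repository, and architecture names are examples, run on the back-end server):

```shell
# Force publishing of one repository, then check the project again.
obs_admin --republish-repository home:user standard
obs_admin --check-project home:user x86_64
```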
Options for maintenance are:
Maintenance Tasks
=================

Note: the --update-*-db calls are usually only needed when corrupt data has
been created, for example after a file system corruption.

--update-source-db [<project>]
    Update the index for all source files.

--update-request-db
    Updates the index for all requests.

--remove-old-sources <days> <y> (--debug)
    WARNING: this is an experimental feature atm. It may trash your data,
    but you have a backup anyway, right?
    Removes sources older than <days> days, but keeps <y> number of revisions.
    --debug for debug output
Options for debugging:
Debug Options
=============

--dump-cache <project> <repository> <architecture>
    Dumps out the content of a binary cache file. This shows all the content
    of a repository, including all provides and requires.

--dump-state <architecture>

--dump-project-from-state <project> <arch>
    Dumps the state of a project.

--dump-relsync <file>
    Dumps the content of :relsync files.

--set-relsync <file> <key> <value>
    Modifies key content in a :relsync file.

--check-meta-xml <project>
--check-meta-xml <project> <package>
    Parses a project or package xml file and prints error messages in case
    of errors.

--check-product-xml <file>
    Parses a product xml file and prints error messages in case of errors.
    It expands all xi:include references and validates the result.

--check-product-group-xml <file>
    Parses a group xml file from a product definition and prints error
    messages in case of errors.

--check-kiwi-xml <file>
--check-kiwi-xml <project> <package>
    Parses a KIWI xml file and prints error messages in case of errors.

--check-constraints <file>
--check-constraints <project> <package>
    Validates a _constraints file.

--check-pattern-xml <file>
    Parses a pattern xml file and prints error messages in case of errors.

--check-request-xml <file>
    Parses a request xml file and prints error messages in case of errors.

--parse-build-desc <file> [<arch> [<buildconfigfile>]]
    Parses a spec, dsc or KIWI file with the Build script parser.

--show-scheduler-architectures
    Shows all architectures which are configured in configuration.xml to be
    supported by this instance.

--show-delta-file <file>
    Shows all instructions of an OBS delta file.

--show-delta-store <file>
    Shows delta store statistics.
osc
The osc command-line client is mainly used by developers and packagers, but for some tasks admins also need this tool. It too has built-in help: use osc --help. The tool needs to be configured first to know the OBS API URL and your user details.
To configure the osc tool for the first time, you need to call it with

osc -A <URL to the OBS API>

For example:

osc -A https://api.testobs.org
Follow the instructions on the terminal.
The password is stored in clear text in the .oscrc file by default, so you need to give this file restrictive access rights; only read/write access for your user should be allowed. osc allows storing the password in other ways (in keyrings, for example) and may use different methods for authentication, like Kerberos (see Section 3.7.7.2, “Kerberos”).
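Restricting the permissions can be done as follows; the snippet uses a temporary stand-in file so it is safe to try anywhere, while the real target is your ~/.oscrc:

```shell
# Give the osc configuration file owner-only read/write permissions.
# A temporary stand-in file is used here; in practice the target is ~/.oscrc.
oscrc="$(mktemp)"
chmod 600 "$oscrc"
stat -c '%a' "$oscrc"   # prints 600
```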
For the admins the most important osc subcommands are:
meta - to create or update projects or package data
API - to read and write online configuration data
osc meta
Subcommand meta: Show meta information, or edit it

Show or edit build service metadata of type <prj|pkg|prjconf|user|pattern>.

This command displays metadata on buildservice objects like projects, packages, or users. The type of metadata is specified by the word after "meta", like e.g. "meta prj".

prj denotes metadata of a buildservice project.
prjconf denotes the (build) configuration of a project.
pkg denotes metadata of a buildservice package.
user denotes the metadata of a user.
pattern denotes installation patterns defined for a project.

To list patterns, use 'osc meta pattern PRJ'. An additional argument will be the pattern file to view or edit.

With the --edit switch, the metadata can be edited. Per default, osc opens the program specified by the environmental variable EDITOR with a temporary file. Alternatively, content to be saved can be supplied via the --file switch. If the argument is '-', input is taken from stdin:

osc meta prjconf home:user | sed ... | osc meta prjconf home:user -F -

For meta prj and prjconf updates, optional commit messages can be applied with --message.

When trying to edit a non-existing resource, it is created implicitly.

Examples:
    osc meta prj PRJ
    osc meta pkg PRJ PKG
    osc meta pkg PRJ PKG -e

Usage:
    osc meta <prj|prjconf> [-r|--revision REV] ARGS...
    osc meta <prj|pkg|prjconf|user|pattern> ARGS...
    osc meta <prj|pkg|prjconf|user|pattern> [-m|--message TEXT] -e|--edit ARGS...
    osc meta <prj|pkg|prjconf|user|pattern> [-m|--message TEXT] -F|--file ARGS...
    osc meta pattern --delete PRJ PATTERN
    osc meta attribute PRJ [PKG [SUBPACKAGE]] [--attribute ATTRIBUTE] [--create|--delete|--set [value_list]]

Options:
    -h, --help            show this help message and exit
    --delete              delete a pattern or attribute
    -s ATTRIBUTE_VALUES, --set=ATTRIBUTE_VALUES
                          set attribute values
    -R, --remove-linking-repositories
                          try to remove also all repositories building
                          against removed ones
    -c, --create          create attribute without values
    -e, --edit            edit metadata
    -m TEXT, --message=TEXT
                          specify log message TEXT. For prj and prjconf
                          meta only
    -r REV, --revision=REV
                          checkout given revision instead of head revision.
                          For prj and prjconf meta only
    -F FILE, --file=FILE  read metadata from FILE, instead of opening an
                          editor. '-' denotes standard input.
    -f, --force           force the save operation, allows one to ignore
                          some errors like depending repositories. For prj
                          meta only.
    --attribute-project   include project values, if missing in packages
    --attribute-defaults  include defined attribute defaults
    -a ATTRIBUTE, --attribute=ATTRIBUTE
                          affect only a given attribute
osc api
Subcommand api: Issue an arbitrary request to the API

Useful for testing.

URL can be specified either partially (only the path component), or fully with URL scheme and hostname ('http://...').

Note the global -A and -H options (see osc help).

Examples:
    osc api /source/home:user
    osc api -X PUT -T /etc/fstab source/home:user/test5/myfstab
    osc api -e /configuration

Usage:
    osc api URL

Options:
    -h, --help            show this help message and exit
    -a NAME STRING, --add-header=NAME STRING
                          add the specified header to the request
    -T FILE, -f FILE, --file=FILE
                          specify filename to upload, uses PUT mode by default
    -d STRING, --data=STRING
                          specify string data for e.g. POST
    -e, --edit            GET, edit and PUT the location
    -X HTTP_METHOD, -m HTTP_METHOD, --method=HTTP_METHOD
                          specify HTTP method to use (GET|PUT|DELETE|POST)
The online API documentation is available at https://build.opensuse.org/apidocs
Some examples for admin stuff:
# Read the global configuration file
osc api /configuration
# Update the global configuration
osc api /configuration -T /tmp/configuration.xml
# Read the distributions list
osc api /distributions
# Update the distributions list
osc api /distributions -T /tmp/distributions.xml
# Retrieve statistics
osc api /statistics/latest_added
Using another Open Build Service as source for build targets is the easiest way to start. The advantage is that you save local resources and do not need to build everything from scratch. The disadvantage is that you depend on the remote instance: if it has a downtime, your instance cannot do any builds for these targets, and if the remote admins decide to remove some targets, you cannot use them anymore.
The easiest way to interconnect with some of the public OBS instances is to use the Web UI. You need to log in with an administrator account of your instance to do this. On the start page of an administrator account you will find a Configuration link. On the Configuration page you find an Interconnect tab at the top; use this and select the public site you want.
If you want to connect to an instance that is not listed, you can simply create a remote project using the osc meta prj command. A remote project differs from a local project in that it has a remoteurl tag (see Section 2.4.2, “Project Metadata”).
Example:
<project name="openSUSE.org">
  <title>openSUSE.org Project Link</title>
  <description>
    This project refers to projects hosted on the openSUSE Build Service
  </description>
  <remoteurl>https://api.opensuse.org/public</remoteurl>
</project>
Sending this via osc to the server:
osc meta prj -m "add openSUSE.org remote" -F /tmp/openSUSE.org.prj
With locally hosted distribution packages you are independent from other parties. On sites with no or bad Internet connection, this is the only way to go. You do not need to build the distribution packages on your instance; you can use binary packages for this. Here are different ways to get a local build repository:
mirror a distribution from another OBS instance
mirror a binary distribution from a public mirror and import the binaries
use already existing local install repositories (for example, from an SMT instance)
use the install media to import the binaries
These tasks need to be run on the OBS back-end. In a partition setup you need to run them on the partition which would be the owner of the project.
Mirroring a project from a remote OBS instance can be done with the obs_mirror_project script which is supplied with the obs sources and via the obs-utils package. You can get the latest version from GitHub: https://raw.githubusercontent.com/openSUSE/open-build-service/master/dist/obs_mirror_project.
The usage:
Usage: obs_mirror_project.rb -p PROJECT -r REPOSITORY [-a ARCHITECTURE] [-d DESTINATION] [-A APIURL] [-t] [-v]

Example: (mirror openSUSE 13.1 as base distro)
    obs_mirror_project -p openSUSE:13.1 -r standard -a i586,x86_64

Options help:
    -p, --proj PROJECT       Project Name: e.g. openSUSE:13.1, Ubuntu:14.04, etc.
    -r, --repo REPOSITORY    Repository Name: e.g. standard, qemu, etc.
    -a, --arch ARCHITECTURE  Architecture Name: e.g. i586, x86_64, etc.
    -d, --dest DESTINATION   Destination Path: e.g. /obs
                             Default: PWD (current working directory)
    -A, --api APIURL         OSC API URL. Default: https://api.opensuse.org
    -t, --trialrun           Trial run: not executing actions
    -v, --verbose            Verbose
    -h, --help               Display this screen
This is the same procedure for all local sources. If you have a local copy of a distribution, you can either use symbolic links to the binary packages or copy them into a directory on the back-end repo server under the /srv/obs/build directory. You should follow the common naming scheme for build repositories here. As a first step, create an empty project for the distribution; you can use the Web UI or the osc command-line tool. Then add a repository with the name standard and the build architectures you want. Here is an example project meta file:
<project name="SUSE:13.2">
  <title>openSUSE 13.2 build repositories</title>
  <description>openSUSE 13.2 build repositories</description>
  <person userid="Admin" role="maintainer"/>
  <build>
    <disable repository="standard"/>
  </build>
  <publish>
    <disable/>
  </publish>
  <repository name="standard">
    <arch>x86_64</arch>
    <arch>i586</arch>
  </repository>
</project>
After you have created the project with these settings, the /srv/obs/build directory should have a tree for SUSE:13.2:
/srv/obs/
├── build
│   └── SUSE:13.2
│       └── standard
│           ├── i586
│           │   ├── :bininfo
│           │   └── :schedulerstate
│           └── x86_64
│               ├── :bininfo
│               └── :schedulerstate
All the directories under /srv/obs/build have to be owned by the obsrun user and group. The obsrun user needs write access to them; if not, the scheduler process will crash on your instance.
You need to import the project configuration as well; you can get it, for example, from the openSUSE Build Service.
osc -A https://api.opensuse.org meta prjconf openSUSE:13.2 >/tmp/13.2.prjconf
osc meta prjconf -m 'Original version from openSUSE' SUSE:13.2 -F /tmp/13.2.prjconf
Now you need to create the ':full' directory for the binary packages under each architecture; it should be owned by obsrun too.
testobs:/srv/www/obs/api # mkdir /srv/obs/build/SUSE\:13.2/standard/i586/:full
testobs:/srv/www/obs/api # mkdir /srv/obs/build/SUSE\:13.2/standard/x86_64/:full
testobs:/srv/www/obs/api # chown obsrun:obsrun \
    /srv/obs/build/SUSE\:13.2/standard/i586/:full
testobs:/srv/www/obs/api # chown obsrun:obsrun \
    /srv/obs/build/SUSE\:13.2/standard/x86_64/:full
Now you can copy (or link) all binary packages for the architecture into the :full directory. You need the architecture-specific packages and the noarch packages as well.
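Copying both kinds of packages can be sketched like this; the snippet uses temporary stand-in directories for the distribution mirror and the :full tree so it is safe to run anywhere, while on a real back-end the target is /srv/obs/build/&lt;project&gt;/&lt;repository&gt;/&lt;arch&gt;/:full owned by obsrun:

```shell
# Copy the architecture-specific and the noarch packages into :full.
# Stand-in directories are used here; real paths and package names differ.
mirror="$(mktemp -d)"; full="$(mktemp -d)"
mkdir -p "$mirror/x86_64" "$mirror/noarch"
touch "$mirror/x86_64/pkg-1.0-1.1.x86_64.rpm" \
      "$mirror/noarch/pkg-doc-1.0-1.1.noarch.rpm"
cp "$mirror"/x86_64/*.rpm "$mirror"/noarch/*.rpm "$full"/
ls "$full" | wc -l   # prints 2
```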
If you import packages for enterprise distributions like SLES12, you also need the packages from the SDK. Maybe you need packages from add-on products as well, depending on what software you want to build.
Finally you should trigger a rescan for the project on the back-end server using obs_admin:
testobs # obs_admin --rescan-repository SUSE:13.2 standard i586
testobs # obs_admin --rescan-repository SUSE:13.2 standard x86_64
This reads all packages and creates the dependency tree.
Source Services are tools to validate, generate or modify sources in a trustable way. They are designed as smallest possible tools and can be combined following the powerful idea of the classic UNIX design.
Design goals of source services were:
server side generated files must be easy to identify and must not be modifiable by the user. This way other users can trust them to be generated in the documented way without modifications.
generated files must never create merge conflicts
generated files must be a separate commit to the user change
services must be runnable at any time without user commit
services must be runnable on server and client side in the same way
services must be designed in a safe way. A source checkout and service run must never harm the system of a user.
services shall be designed in a way to avoid unnecessary commits. This means there shall be no time-dependent changes. In case the package already contains the same file, the newly generated file must be dropped.
local services can be added and used by everybody.
server side services must be installed by the admin of the OBS server.
services can be defined per package or project wide.
Source Services may be used to validate sources. This can happen per package, which is useful when the packager wants to validate that downloaded sources are really from the original maintainer. Or validation can happen for an entire project to apply general policies. These services cannot be skipped in any package.
Validation can happen by validating files (for example, using the verify_file or source_validator service). These services just fail in the error case, which leads to the build state "broken". Or validation can happen by redoing a certain action and storing the result as a new file, as download_files does. In this case the newly generated file will be used instead of the committed one during build.
Each service can be used in a special mode defining when it should run and how to use the result. This can be done per package or globally for an entire project.
The default mode of a service is to always run after each commit on the server side and locally before every local build.
trylocal Mode

The trylocal mode runs the service locally when using current osc versions. The result gets committed as standard files, not named with the _service: prefix. Additionally, the service runs on the server by default, but usually the service should detect that the result is the same and skip the generated files. In case they differ for any reason (because the Web UI or API was used, for example), they will be generated and added on the server.
localonly Mode

The localonly mode runs the service locally when using current osc versions. The result gets committed as standard files, not named with the _service: prefix. The service never runs on the server side. It is also not possible to trigger it manually.
serveronly Mode

The serveronly mode runs the service on the server only. This can be useful when the service is not available or cannot work on developer workstations.
buildtime Mode

The service runs inside of the build job, for local and server side builds. A side effect is that the service package becomes a build dependency and must be available. Every user can provide and use a service this way in their projects. The generated sources are not part of the source repository, but part of the generated source packages. Network access is not available when the workers are running in a secure mode.
disabled Mode

The disabled mode runs the service neither locally nor on the server side. It can be used to temporarily disable the service while keeping the definition as part of the service definition. Or it can be used to define the way how to generate the sources and do so by manually calling

osc service runall

The result will get committed as standard files again.
The called services are always defined in a _service file. It is either part of the package sources or used project-wide when stored inside the _project package.
The _service file contains a list of services which get called in this order. Each service may define a list of parameters and a mode. The project wide services get called after the per package defined services. The _service file is an xml file like this example:
<services>
  <service name="download_files" mode="trylocal" />
  <service name="verify_file">
    <param name="file">krabber-1.0.tar.gz</param>
    <param name="verifier">sha256</param>
    <param name="checksum">7f535a96a834b31ba2201a90c4d365990785dead92be02d4cf846713be938b78</param>
  </service>
  <service name="update_source" mode="disabled" />
</services>
This example downloads the files from the URLs given in the spec file via the download_files service. When using osc, these files get committed as part of the commit. Afterwards the krabber-1.0.tar.gz file will always be compared with the sha256 checksum. And last but not least, there is the update_source service mentioned, which is usually not executed, except when osc service runall is called, which will try to upgrade the package to a newer source version available online.
Sometimes it is useful to continue working on generated files manually. In this situation the _service file needs to be dropped, but all generated files need to be committed as standard files. The OBS provides the "mergeservice" command for this. It can also be used via osc by calling osc service merge.
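The commands involved can be sketched as follows (run inside a checked-out package working copy):

```shell
# Run all services defined in _service locally, regenerating the files.
osc service runall
# Turn the generated files into standard files and drop the _service handling.
osc service merge
```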
The dispatcher takes a job from the scheduler and assigns it to a free worker. It tries to share the available build time fairly between all the project repositories with pending jobs. To achieve this, the dispatcher calculates a load of used build time per project repository (similar to the system load in Unix operating systems). The dispatcher assigns jobs to build clients from the repository with the lowest load (thereby increasing its load). It is possible to tweak this mechanism via dispatching priorities assigned to the repositories, either via the /build/_dispatchprios API call or via the dispatch_adjust array in the BSConfig.pm configuration file (Section 2.1.2.2, “BSConfig.pm”).
/build/_dispatchprios API Call

The /build/_dispatchprios API call allows an admin to set a priority for defined projects and repositories using the HTTP PUT method. With the HTTP GET method the current XML priority file can be read.
<dispatchprios>
  <prio project="ProjectName" repository="RepoName" arch="Architecture" adjust="Number" />
</dispatchprios>
The attributes project, repository and arch are all optional. If, for example, arch and repository are missing, the entry is used for all repositories and architectures of the given project. It is not supported to use regular expressions for the names. The adjust value is taken as a logarithmic scale factor applied to the current load of the repository during the comparison. Projects without any entry get a default priority of 0; higher values cause the matching projects to get more build time.
Example dispatchprios XML file
<dispatchprios>
  <prio project="DemoProject1" repository="openSUSE_Leap_42.1" adjust="10" />
  <prio project="Test1" adjust="5" />
  <prio project="Test11" repository="openSUSE_13.2" arch="i586" adjust="-10"/>
</dispatchprios>
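Reading and updating the priority file through the API can be sketched with osc api (the temporary file name is an example; -T uploads with PUT as documented in the osc api section):

```shell
# Fetch the current priorities, edit them, and upload the result.
osc api /build/_dispatchprios > /tmp/dispatchprios.xml
# ... edit /tmp/dispatchprios.xml ...
osc api -X PUT -T /tmp/dispatchprios.xml /build/_dispatchprios
```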
priority | scale factor | priority | scale factor
---|---|---|---
-50 | 100000 | 3 | 0.5
-30 | 1000 | 5 | 0.3
-20 | 100 | 7 | 0.2
-15 | 30 | 10 | 0.1
-10 | 10 | 15 | 0.03
-7 | 5 | 20 | 0.01
-5 | 3 | 30 | 0.001
-3 | 2 | 40 | 0.0001
0 | 1 | 50 | 0.00001
dispatch_adjust Array

With the dispatch_adjust array in the BSConfig.pm file, the dispatch priorities of project repositories can be adjusted based on regular expressions for the project, the repository name, and maybe the architecture. Each match will add or subtract a value to the priority of the repository. The default priority is 0; higher values cause the matching projects to get more build time.
Each entry in the dispatch_adjust array has the format
'regex string' => priority adjustment
The full name of a build repository looks like

Project:Subproject/Repository/Architecture

Examples:

Devel:Science/SLES-11/i586
home:king:test/Leap42/x86_64
If a repository matches a string, the adjustment is added to the current value. The final value is the sum of the adjustments of all matched entries. This sum is the same logarithmic scale factor as described in the previous section.
Example dispatch_adjust definition in the BSConfig.pm
our $dispatch_adjust = [
    'Devel:'       =>   7,
    'HotFix:'      => +20,
    '.+:test.*'    => -10,
    'home:'        =>  -3,
    'home:king'    => +30,
    '.+/SLE12-SP2' => -40,
];
The above example could have the following background: All Devel projects should get a higher priority so the developer jobs get more build time. The projects under HotFix are very important fixes for customers, so they should get a worker as soon as possible. All projects with test in the name get some penalty; home projects also get only about half of the build time of a normal project, with the exception of the home project of king, the user account of the boss. The SLE12-SP2 repository is not in real use yet, but if there is nothing else to do, build for it as well.
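How the adjustments accumulate can be illustrated with a small sketch that sums all matching entries for one repository name, mimicking the dispatcher's behavior for the example dispatch_adjust definition above (this assumes unanchored regex matching, which is an assumption of the sketch):

```shell
# Sum the adjustments of all dispatch_adjust entries matching one
# repository name (sketch; assumes unanchored regex matching).
repo="home:king:test/Leap42/x86_64"
total=0
while read -r regex adj; do
    [[ $repo =~ $regex ]] && total=$((total + adj))
done <<'EOF'
Devel: 7
HotFix: 20
.+:test.* -10
home: -3
home:king 30
.+/SLE12-SP2 -40
EOF
echo "$total"   # prints 17 (-10 - 3 + 30)
```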
The dispatcher calculates the values from the dispatch_adjust array first; if the same project and repository also has an entry in the dispatchprios XML file, the XML file entry will overwrite the calculated priority. The best practice is to only use one of the methods.
The job of the publisher service is to publish the built packages and/or images by creating repositories that are made available through a web server.
It can be configured to use custom scripts to copy the build results to different servers or do anything with them that comes to mind. These scripts are called publisher hooks.
Hooks are configured via the configuration file /usr/lib/obs/server/BSConfig.pm, where one script per project/repository combination is configured; the script is run when that combination is published. It is possible to use regular expressions here.
The script is called by the user obsrun with the following parameters:

information about the project and its repository (for example, training/SLE11-SP1)

path to the published repository (for example, /srv/obs/repos/training/SLE11-SP1)

changed packages (for example, x86_64/test.rpm x86_64/utils.rpm)
The hooks are configured by adding a hash reference named $publishedhook to the BSConfig.pm configuration file. The key contains the project, and the value references the accompanying script. If the value is written as an array reference it is possible to call the hook with self-defined parameters.
The publisher will add the three listed parameters at the end, after the self-defined parameters (in /usr/lib/obs/server/BSConfig.pm):
our $publishedhook = {
"Product/SLES12" => "/usr/local/bin/script2run_sles12",
"Product/SLES11-SP3" => "/usr/local/bin/script2run_sles11",
"Product/SLES11-SP4" => "/usr/local/bin/script2run_sles11",
};
Regular expressions or substrings can be used to define a script for more than one repository in one project. The use of regular expressions has to be activated by defining $publishedhook_use_regex = 1; as follows (in /usr/lib/obs/server/BSConfig.pm):
our $publishedhook_use_regex = 1;
our $publishedhook = {
"Product\/SLES12" => "/usr/local/bin/script2run_sles12",
"Product\/SLES11.*" => "/usr/local/bin/script2run_sles11",
};
With self defined parameters:
our $publishedhook_use_regex = 1;
our $publishedhook = {
"Product\/SLES11.*" => ["/usr/local/bin/script2run", "sles11", "/srv/www/public_mirror"],
};
The configuration is read by the publisher at startup only, so it has to be restarted after configuration changes have been made. The hook script’s output is not logged by the publisher and should be written to a log file by the script itself. In case of a broken script, this is logged in the publisher’s log file (/srv/obs/log/publisher.log by default):
Mon Mar 7 14:34:17 2016 publishing Product/SLES12
    fetched 0 patterns
    running createrepo
    calling published hook /usr/local/bin/script2run_sles12
    /usr/local/bin/script2run_sles12 failed: 65280
    syncing database (6 ops)
Interactive scripts do not work and will fail immediately.
If you need to do a lot of work in the hook script and do not want to block the publisher all the time, you should consider using a separate daemon that does all the work and just gets triggered by the configured hook script.
The scripts are called without a timeout.
The following example script ignores the packages that have changed and copies all RPMs from the repository directory to a target directory:
#!/bin/bash
OBSHOME="/srv/obs"
SRC_REPO_DIR="$OBSHOME/repos"
LOGFILE="$OBSHOME/log/reposync.log"
DST_REPO_DIR="/srv/repo-mirror"
# Global substitution! To handle strings like Foo:Bar:testing with
# two colons!
PRJ_PATH=${1//:/:\/}
PATH_TO_REPO=$2
rsync -a --log-file=$LOGFILE $PATH_TO_REPO/ $DST_REPO_DIR/$PRJ_PATH/
For testing purposes, it can be invoked as follows:
$ sudo -u obsrun /usr/local/bin/publish-hook.sh Product/SLES11-SP1 \
    /srv/obs/repos/Product/SLE11-SP1
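The ${1//:/:\/} expansion used in the script turns every colon of the project name into ":/", mapping a project like Foo:Bar:testing onto a nested directory path. A small demonstration:

```shell
# Demonstrate the project-name-to-path substitution used in the hook:
# every ":" becomes ":/", so subprojects map to subdirectories.
prj="Foo:Bar:testing"
prj_path=${prj//:/:\/}
echo "$prj_path"   # prints Foo:/Bar:/testing
```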
The following example script reads the destination path from a parameter that is configured with the hook script:
#!/bin/bash
LOGFILE="/srv/obs/log/reposync.log"
DST_REPO_DIR=$1
# Global substitution! To handle strings like Foo:Bar:testing with
# two colons!
PRJ_PATH=${2//:/:\/}
PATH_TO_REPO=$3
mkdir -p $DST_REPO_DIR/$PRJ_PATH
rsync -a --log-file=$LOGFILE $PATH_TO_REPO/ $DST_REPO_DIR/$PRJ_PATH/
For testing purposes, it can be invoked as follows:
$ sudo -u obsrun /usr/local/bin/publish-hook.sh \
    /srv/www/public_mirror/Product/SLES11-SP1 \
    /srv/obs/repos/Product/SLE11SP1
The following example script only copies packages that have changed, but does not delete packages that have been removed:
#!/bin/bash
DST_REPO_DIR=$1
PRJ_PATH=${2//:/:\/}
PATH_TO_REPO=$3
shift 3
mkdir -p $DST_REPO_DIR/$PRJ_PATH
while [ $# -gt 0 ]
do
dir=(${1//\// })
if [ ! -d "$DST_REPO_DIR/$PRJ_PATH/$dir" ]; then
mkdir -p $DST_REPO_DIR/$PRJ_PATH/$dir
fi
cp $PATH_TO_REPO/$1 $DST_REPO_DIR/$PRJ_PATH/$1
shift
done
createrepo $DST_REPO_DIR/$PRJ_PATH/.
For testing purposes, it can be invoked as follows:
$ sudo -u obsrun /usr/local/bin/publish-hook.sh /srv/www/public_mirror \
    Product/SLES11-SP1 /srv/obs/repos/Product/SLE11-SP1 \
    src/icinga-1.13.3-1.3.src.rpm x86_64/icinga-1.13.3-1.3.x86_64.rpm \
    x86_64/icinga-devel-1.13.3-1.3.x86_64.rpm
The job of the publisher service is to publish the built packages and/or images by creating repositories that are made available through a web server.
The OBS Publisher can be configured to use custom scripts to be called whenever already published packages get removed. These scripts are called unpublisher hooks. Unpublisher hooks are run before the publisher hooks.
Hooks are configured via the configuration file /usr/lib/obs/server/BSConfig.pm, where one script per project/repository combination is configured; the script is run when that combination is removed. It is possible to use regular expressions here.
The script is called by the user
obsrun
with the following
parameters:
information about the project and its repository (for example, training/SLE11-SP1)
repository path (for example, /srv/obs/repos/training/SLE11-SP1)

removed packages (for example, x86_64/test.rpm x86_64/utils.rpm)
The hooks are configured by adding a hash reference named $unpublishedhook to the BSConfig.pm configuration file. The key contains the project and the value references the accompanying script. If the value is written as an array reference, it is possible to call the hook with custom parameters.
The publisher adds the three listed parameters at the end, directly after the custom parameters (in /usr/lib/obs/server/BSConfig.pm):
our $unpublishedhook = {
"Product/SLES12" => "/usr/local/bin/script2run_sles12",
"Product/SLES11-SP3" => "/usr/local/bin/script2run_sles11",
"Product/SLES11-SP4" => "/usr/local/bin/script2run_sles11",
};
Regular expressions or substrings can be used to define a script for more than one repository in one project. The use of regular expressions needs to be activated by defining $unpublishedhook_use_regex = 1; (in /usr/lib/obs/server/BSConfig.pm):
our $unpublishedhook_use_regex = 1;
our $unpublishedhook = {
"Product\/SLES12" => "/usr/local/bin/script2run_sles12",
"Product\/SLES11.*" => "/usr/local/bin/script2run_sles11",
};
With custom parameters:
our $unpublishedhook_use_regex = 1;
our $unpublishedhook = {
"Product\/SLES11.*" => [
"/usr/local/bin/script2run", "sles11", "/srv/www/public_mirror"
],
};
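With that configuration, the publisher appends the three standard parameters after the custom ones. A minimal sketch of the resulting argument order (log_args is a hypothetical stand-in for /usr/local/bin/script2run):

```shell
#!/bin/bash
# Hypothetical stand-in showing the argument order the hook receives:
# custom parameters first, then project/repository, repository path,
# and finally the removed packages.
log_args() {
  flavor=$1 mirror=$2 project=$3 repodir=$4
  shift 4                     # the rest are the removed packages
  echo "flavor=$flavor mirror=$mirror"
  echo "project=$project repodir=$repodir"
  echo "removed: $*"
}
log_args sles11 /srv/www/public_mirror \
  Product/SLES11-SP4 /srv/obs/repos/Product/SLES11-SP4 \
  x86_64/foo-1.0.x86_64.rpm x86_64/bar-1.0.x86_64.rpm
```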
The configuration is read by the publisher at startup only, so it has to be restarted after configuration changes have been made. The hook script’s output is not logged by the publisher and should be written to a log file by the script itself. In case of a broken script, this is logged in the publisher’s log file (/srv/obs/log/publisher.log by default):
Mon Mar 7 14:34:17 2016 publishing Product/SLES12
fetched 0 patterns
running createrepo
calling unpublished hook /usr/local/bin/script2run_sles12
/usr/local/bin/script2run_sles12 failed: 65280
syncing database (6 ops)
Interactive scripts do not work and will fail immediately.
If you need to do a lot of work in the hook script and do not want to block the publisher all the time, consider using a separate daemon that does all the work and just gets triggered by the configured hook script.
The scripts are called without a timeout.
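One way to keep the hook cheap, as suggested above, is to have it only record the event and let a separate daemon do the heavy lifting. A minimal sketch (SPOOL_DIR and the file naming are assumptions, not an OBS convention):

```shell
#!/bin/bash
# Sketch: a non-blocking unpublisher hook that only spools the event.
# A separate worker daemon would watch SPOOL_DIR and do the real work.
SPOOL_DIR="${SPOOL_DIR:-/var/spool/obs-hooks}"   # hypothetical location
mkdir -p "$SPOOL_DIR"
# one file per publish event, arguments one per line
printf '%s\n' "$@" > "$SPOOL_DIR/$(date +%s%N).event"
```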
Reminder: if both unpublisher and publisher hooks are defined, the unpublisher hook runs before the publisher hook.
The following example script deletes all packages from the target directory that have been removed from the repository.
#!/bin/bash
OBSHOME="/srv/obs"
LOGFILE="$OBSHOME/log/reposync.log"
DST_REPO_DIR="/srv/repo-mirror"
# Global substitution! Turns a project name like Foo:Bar:testing into the
# path fragment Foo:/Bar:/testing (every colon becomes ":/").
PRJ_PATH=${1//:/:\/}
PATH_TO_REPO=$2
shift 2
while [ $# -gt 0 ]
do
  rm -v "$DST_REPO_DIR/$PRJ_PATH/$1" >>"$LOGFILE" 2>&1
  shift
done
For testing purposes, it can be invoked as follows:
$ sudo -u obsrun /usr/local/bin/unpublish-hook.sh \
    Product/SLES11-SP1 \
    /srv/obs/repos/Product/SLE11-SP1 \
    src/icinga-1.13.3-1.3.src.rpm \
    x86_64/icinga-1.13.3-1.3.x86_64.rpm \
    x86_64/icinga-devel-1.13.3-1.3.x86_64.rpm
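The only non-obvious line in the script is the parameter expansion that builds PRJ_PATH. It rewrites every colon in the project name to ":/", mapping the flat project name onto the nested directory layout used on disk:

```shell
# ${var//:/:\/} replaces every ":" with ":/":
prj="Foo:Bar:testing"
echo "${prj//:/:\/}"    # prints Foo:/Bar:/testing
```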
The following example script reads the destination path from a custom parameter configured in the hook definition. Note that the custom parameter is passed first, so the project name and repository path shift to the second and third positions:
#!/bin/bash
OBSHOME="/srv/obs"
LOGFILE="$OBSHOME/log/reposync.log"
DST_REPO_DIR=$1
# Global substitution! Turns a project name like Foo:Bar:testing into the
# path fragment Foo:/Bar:/testing (every colon becomes ":/").
PRJ_PATH=${2//:/:\/}
PATH_TO_REPO=$3
shift 3
while [ $# -gt 0 ]
do
  rm -v "$DST_REPO_DIR/$PRJ_PATH/$1" >>"$LOGFILE" 2>&1
  shift
done
For testing purposes, it can be invoked as follows:
$ sudo -u obsrun /usr/local/bin/unpublish-hook.sh \
    /srv/www/public_mirror \
    Product/SLES11-SP1 \
    /srv/obs/repos/Product/SLE11-SP1 \
    src/icinga-1.13.3-1.3.src.rpm \
    x86_64/icinga-1.13.3-1.3.x86_64.rpm \
    x86_64/icinga-devel-1.13.3-1.3.x86_64.rpm
OBS has integrated user and group management with a role-based access rights model. In every OBS instance, at least one user needs to exist and have the global Admin role assigned. Groups can be defined by the admin; instead of adding a list of users to a project or package role, users can be added to a group, and the group can be assigned to the project or package role.
The OBS role model has one global role: Admin, which can be granted to users. An OBS admin has access to all projects and packages via the API and the web user interface. Some menus in the web UI do not allow changes by an admin (for example, the Repository menu) as long as the admin is not also a maintainer of the project, but the same change can be made by editing the metadata directly. All other roles are specific to projects and packages and can be assigned to a user or a group.
Role | Description | Remarks
---|---|---
Maintainer | Read and write access to projects or packages |
Bugowner | Read access to projects or packages | should be unique per package
Reader | Read access to sources |
Downloader | Read access to the binaries |
Reviewer | Default reviewer for a package or project |
OBS provides its own user database, which can also store a password. Authentication to the API happens via HTTP Basic Auth. See the API documentation for how to create, modify or delete user data; a call for changing the password also exists.
Users can be added by the maintainer, or via the registration menu in the web UI if registration is allowed. OBS can be configured to require a confirmation after registration before the user may log in.
Administrators can create groups, add users to them, remove users from them, and give Maintainer rights to users. A group maintainer is then also able to add users, remove users, and give maintainer rights to other users.
osc api -X PUT -d '<group><title><group-title></title><email><group-email></email><maintainer userid="<user-name>"/><person><person userid="<user-name>"/></person></group>' "/group/<group-title>"
In certain cases, it might be desirable to show a Gravatar for a group, similar to users. To show a Gravatar, an email address is needed; therefore, an admin needs to add an email address to the group through the API. This can be achieved with:
osc api -X POST "/group/<group-title>?cmd=set_email&email=<groups-email-address>"
The proxy mode can be used for specially secured instances where the OBS web server is not directly connected to the network. Authentication proxy products exist which perform the authentication and send the user name to OBS via an HTTP header. Originally, this was developed for IChain, a legacy single sign-on authentication method from Novell. This approach has the additional advantage that the user password never reaches OBS.
The proxy mode can also be used for LDAP or Active Directory, but only for authentication.
With proxy mode enabled, OBS trusts the user name in the HTTP header. Since it was verified by the web server, and the web server only forwards requests for a verified and authenticated session, this is safe as long as you make sure the direct web/API interface of OBS is not reachable from the outside.
With proxy mode, the user still needs to be registered in OBS, and all OBS roles and user properties are managed inside OBS.
LDAP support was long considered experimental and not officially supported; it is officially supported since the 2.8.3 release.
Using LDAP or Active Directory as the source of user and, optionally, group information has the advantage that user-related data only needs to be maintained in one place, in environments that already run such a server. In the following sections we write LDAP, but this includes Microsoft's Active Directory as well; Active Directory (AD) is only mentioned explicitly where differences exist.
In this mode, the OBS API contacts the LDAP server directly. If the user is found and provides the correct password, the user is added transparently to the OBS user database. The password or password hash is not stored in the OBS database; because the password field in the user database is mandatory, a random hash is stored instead. The LDAP interface allows restricting access to users who are in a special LDAP group. Optionally, groups can also be discovered from the LDAP server, and these can be filtered as well.
Before anybody can add a user with a role to a package or project, the user needs to have logged in at least once, since the check for available users is local only. If the LDAP group mode is enabled, LDAP groups are also added transparently when an existing group on the LDAP server is added to a project or package.
On bigger installations, this mode can result in many search requests to the LDAP server and slow down access to projects and packages, because every role check triggers an LDAP search operation. As an alternative, group mirroring was implemented: the internal OBS group database is updated with the group membership information during user authentication. All role checks are then made locally against the OBS database and need no additional LDAP operations.
In :mirror mode, the local user group membership is updated as follows: when the user logs in, the user's memberOf attributes are parsed and compared with the global OBS group list. If a group matches, the user is added; if they are no longer a group member, they are removed. Since this can be a costly operation, depending on the number of groups, it is only done on a full login. After a full login, the user status is cached for two minutes; if the user logs in again during this time, nothing is checked or updated. There is a second mechanism to update user membership: if somebody adds a new group in OBS, the member attributes of the group are parsed, and all current users in the local database become members.
Currently, the main OBS LDAP configuration is in the file options.yml. Besides the settings in that file, the OpenLDAP configuration file is also evaluated by the Ruby LDAP implementation. This configuration file is usually located at /etc/openldap/ldap.conf. You can set additional TLS/SSL directives there, like TLS_CACERT, TLS_CACERTDIR and TLS_REQCERT. For more information, refer to the OpenLDAP man page (man ldap.conf).
When LDAP mode is activated, users can only log in via LDAP. This also includes existing admin accounts. To make an LDAP user an admin, use a rake task on the OBS instance. For example, to make user tux an admin, use:
cd /srv/www/obs/api
bundle exec rake user:give_admin_rights tux RAILS_ENV=production
Config item | Description | Values / default | Remarks
---|---|---|---
ldap_mode | OBS LDAP mode on/off | |
ldap_servers | List of LDAP servers | colon-separated list |
ldap_max_attempts | Tries to contact the LDAP server | int |
ldap_search_timeout | Timeout of an LDAP search | int 0…N | 0 = wait forever
ldap_user_memberof_attr | User attribute for group membership | | case sensitive
ldap_group_member_attr | Group attribute for members | |
ldap_ssl | Use ldaps port and protocol | |
ldap_start_tls | Use StartTLS on the LDAP protocol | :off |
ldap_port | LDAP port number | if not set: 389 for LDAP, 636 for LDAPS |
ldap_referrals | Windows 2003 AD requires :off | |
ldap_search_base | Company's LDAP search base for the users who will use OBS | |
ldap_search_attr | User ID attribute | | sAMAccountName for AD, uid for OpenLDAP
ldap_name_attr | Full user name | |
ldap_mail_attr | Attribute for the user's email | |
ldap_search_user | Bind user for the LDAP search | for example, cn=ldapbind,ou=system,dc=mycompany,dc=com |
ldap_search_auth | Password for the ldap_search_user | |
ldap_user_filter | Search filter for OBS users | for example, a group membership; if empty, all users are allowed |
ldap_authenticate | How the credentials are verified | | only use :ldap
ldap_auth_mech | Used auth mechanism | | only with :local
ldap_auth_attr | Auth attribute used for :local | | do not use
ldap_group_support | Import OBS groups from LDAP | | see text
ldap_group_search_base | Company's LDAP search base for groups | |
ldap_group_title_attr | Attribute of the group name | |
ldap_group_objectclass_attr | Object class for a group | |
ldap_obs_admin_group | Group name for OBS admins | | if set, members of that group get the OBS admin role
Example LDAP section of the options.yml file:
[...]
##################
# LDAP options
##################
ldap_mode: :on
# LDAP Servers separated by ':'.
# OVERRIDE with your company's ldap servers. Servers are picked randomly for
# each connection to distribute load.
ldap_servers: ldap1.mycompany.com:ldap2.mycompany.com
# Max number of times to attempt to contact the LDAP servers
ldap_max_attempts: 15
# timeout of ldap search requests to avoid infinite lookups (in seconds, 0 = no timeout)
ldap_search_timeout: 5
# The attribute the user memberOf is stored in (case sensitive!)
ldap_user_memberof_attr: memberOf
# Perform the group_user search with the member attribute of the group entry
# or the memberof attribute of the user entry; it depends on your ldap setup.
# The attribute the group member is stored in
ldap_group_member_attr: member
# If you're using ldap_authenticate: :ldap then you should ensure that
# ldaps is used to transfer the credentials over SSL, or use the StartTLS extension
ldap_ssl: :on
# Use StartTLS extension of LDAP
ldap_start_tls: :off
# LDAP port defaults to 636 for ldaps and 389 for ldap and ldap with StartTLS
#ldap_port:
# Authentication with Windows 2003 AD requires
ldap_referrals: :off
# OVERRIDE with your company's ldap search base for the users who will use OBS
ldap_search_base: ou=development,dc=mycompany,dc=com
# Account name attribute (sAMAccountName for Active Directory, uid for openLDAP)
ldap_search_attr: sAMAccountName
# The attribute the user's name is stored in
ldap_name_attr: cn
# The attribute the user's email is stored in
ldap_mail_attr: mail
# Credentials to use to search ldap for the username
ldap_search_user: "cn=ldapbind,ou=system,dc=mycompany,dc=com"
ldap_search_auth: "top secret"
# By default any LDAP user can be used to authenticate to the OBS.
# In some deployments this may be too broad and certain criteria should
# be met, e.g. group membership.
#
# To allow only users in a specific group uncomment this line:
ldap_user_filter: (memberof=cn=obsusers,ou=groups,dc=mycompany,dc=com)
#
# Note this is joined to the normal selection like so:
#   (&(#{ldap_search_attr}=#{login})#{ldap_user_filter})
# giving an ldap search of:
#   (&(sAMAccountName=#{login})(memberof=CN=group,OU=Groups,DC=Domain Component))
#
# Also note that openLDAP must be configured to use the memberOf overlay
#
# ldap_authenticate says how the credentials are verified:
#   :ldap  = attempt to bind to ldap as user using supplied credentials
#   :local = compare the credentials supplied with those in
#            LDAP using #{ldap_auth_attr} & #{ldap_auth_mech}
# if :local is used then ldap_auth_mech can be:
#   :md5
#   :cleartext
ldap_authenticate: :ldap
ldap_auth_mech: :md5
# This is a string
ldap_auth_attr: userPassword
# Whether to search group info from ldap; it does not take effect if it is not set.
# Please also set the ldap_group_* configs below correctly to ensure the operation works properly.
# Possible values:
#   :off    disabled
#   :on     enabled; every group member operation asks the LDAP server
#   :mirror enabled; group membership is mirrored and updated on user login
# ldap_group_support: :mirror
# OVERRIDE with your company's ldap search base for groups
ldap_group_search_base: ou=obsgroups,dc=mycompany,dc=com
# The attribute the group name is stored in
ldap_group_title_attr: cn
# The value of the group objectclass attribute
# (group for Active Directory, groupOfNames in openLDAP)
ldap_group_objectclass_attr: group
# The LDAP group for obs admins;
# if this group is set and a user belongs to this group they get the global admin role
# ldap_obs_admin_group: obsadmins
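As the comment block above notes, ldap_user_filter is combined with ldap_search_attr and the login into a single LDAP filter. The string that is sent to the server can be reproduced like this (a sketch using the example values from the configuration above):

```shell
# Rebuild the effective LDAP filter from the example configuration above:
ldap_search_attr="sAMAccountName"
ldap_user_filter="(memberof=cn=obsusers,ou=groups,dc=mycompany,dc=com)"
login="tux"
echo "(&(${ldap_search_attr}=${login})${ldap_user_filter})"
# prints (&(sAMAccountName=tux)(memberof=cn=obsusers,ou=groups,dc=mycompany,dc=com))
```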
The LDAP mode has two methods to check authorization:
- LDAP bind method: with the provided credentials, an LDAP bind request is attempted.
- Local method: the provided credentials are checked locally against the content of the userPassword attribute.
The local method should not be used, since in most LDAP installations the userPassword attribute is not readable unless you bind as a privileged user.
In OBS, you can use single sign-on via Kerberos tickets.
The OBS Kerberos configuration resides in the options.yml file.
Config item | Description | Example
---|---|---
kerberos_keytab | Kerberos key table: file where long-term keys for one or more principals are stored | "/etc/krb5.keytab"
kerberos_service_principal | Kerberos OBS principal: the unique identity to which Kerberos can assign tickets | "HTTP/hostname.example.com@EXAMPLE.COM"
kerberos_realm | Kerberos realm: authentication administrative domain | "EXAMPLE.COM"
Example Kerberos section of the options.yml file:
[...]
##################
# Kerberos options
##################
kerberos_mode: true
kerberos_keytab: "/etc/krb5.keytab"
kerberos_service_principal: "HTTP/hostname.example.com@EXAMPLE.COM"
kerberos_realm: "EXAMPLE.COM"
[...]
Once Kerberos is enabled, only users with logins that match users known to Kerberos will be able to authenticate to OBS. It is recommended to give admin rights to a matching user before enabling Kerberos mode.
OBS 2.5 introduced a mechanism to create tokens for specific operations. Tokens allow others to execute certain operations in the name of a user, which is especially useful when integrating external infrastructure. The created token should be kept secret, but it can be revoked at any time if it becomes obsolete or leaks.
Tokens always belong to a user. A list of active tokens can be received via
osc token
osc token --delete <TOKEN>
A token can be used to execute a source service. The source service has to be set up for the package first; check the source service chapter for this. A typical example is updating the sources of a package from git. A source service for that can be set up with
osc add git://....
A token can be registered as a generic token, which allows executing all source services in OBS where the user has permissions. You can create such a token and execute the operation with
osc token --create
osc token --trigger <TOKEN> <PROJECT> <PACKAGE>
osc api -X POST "/trigger/runservice?token=<TOKEN>&project=<PROJECT>&package=<PACKAGE>"
You can also limit the token to a specific package. The advantage is that the operation is limited to that package, so less damage can be done if the token leaks. Also, you do not need to specify the package at execution time. Create and execute it with
osc token --create <PROJECT> <PACKAGE>
osc token --trigger <TOKEN>
osc api -X POST /trigger/runservice?token=<TOKEN>
OBS has an integrated notification subsystem for sending events that happen in the application through a message bus. We have chosen RabbitMQ (https://www.rabbitmq.com/) as our message bus server technology, based on the AMQP (https://www.amqp.org/) protocol.
RabbitMQ claims to be "the most popular open source message broker". It can deliver asynchronous messages in many different exchange patterns (one to one, broadcasting, based on topics) and includes a flexible routing system based on queues.
RabbitMQ is lightweight and easy to deploy, on premises and in the cloud. It supports multiple messaging protocols and can be deployed in distributed and federated configurations to meet high-scale, high-availability requirements.
Currently, the RabbitMQ configuration is in the file options.yml. All related options start with the prefix amqp. These configuration items map to some of the calls we make using the Bunny (http://rubybunny.info/) gem.
Config item | Description | Values / default | Remarks
---|---|---|---
amqp_namespace | Namespace for the queues of this instance | | is a prefix for the queue names
amqp_options | Connection configuration | | See this guide (http://rubybunny.info/articles/connecting.html) for the allowed parameters.
amqp_options[host] | Server host | a valid hostname |
amqp_options[port] | Server port | |
amqp_options[user] | User account | |
amqp_options[pass] | Account password | |
amqp_options[vhost] | Virtual host | |
amqp_exchange_name | Name for the exchange | |
amqp_exchange_options | Exchange configuration | | See this guide (http://rubybunny.info/articles/exchanges.html) to learn more about exchanges.
amqp_exchange_options[type] | Type of communication for the exchange | |
amqp_exchange_options[auto_delete] | If set, the exchange is deleted when all queues have finished using it | |
amqp_exchange_options[arguments] | More configuration for plugins / extensions | |
amqp_queue_options | Queue configuration | | See this guide (http://rubybunny.info/articles/queues.html) to learn more about queues.
amqp_queue_options[durable] | Should this queue be durable? | |
amqp_queue_options[auto_delete] | Should this queue be automatically deleted when the last consumer disconnects? | |
amqp_queue_options[exclusive] | Should this queue be exclusive (can only be used by this connection, removed when the connection is closed)? | |
amqp_queue_options[arguments] | Additional optional arguments (typically used by RabbitMQ extensions and plugins) | |
Example of the RabbitMQ section of the options.yml file:
[...]
# RabbitMQ based message bus
#
# Prefix of the message bus routing key
amqp_namespace: 'opensuse.obs'

# Connection options -> http://rubybunny.info/articles/connecting.html
amqp_options:
  host: rabbit.example.com
  port: 5672
  user: guest
  pass: guest
  vhost: /vhost

# Exchange options -> http://rubybunny.info/articles/exchanges.html
amqp_exchange_name: pubsub
amqp_exchange_options:
  type: topic
  auto_delete: false
  arguments:
    persistent: true
    passive: true

# Queue options -> http://rubybunny.info/articles/queues.html
amqp_queue_options:
  durable: false
  auto-delete: false
  exclusive: false
  arguments:
    extension_1: blah
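The __prefix__ placeholder in the queue names below is the configured amqp_namespace, so with the example configuration above a build-success event travels with the following routing key (illustrative sketch, not OBS code):

```shell
# Routing key = amqp_namespace + "." + event name (sketch):
amqp_namespace="opensuse.obs"
event="package.build_success"
echo "${amqp_namespace}.${event}"    # prints opensuse.obs.package.build_success
```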
Queue Name | Description | Payload
---|---|---
__prefix__.package.build_success | A package build has succeeded | :repository, :arch, :release, :readytime, :srcmd5, :rev, :reason, :bcnt, :verifymd5, :hostarch, :starttime, :endtime, :workerid, :versrel, :previouslyfailed
__prefix__.package.build_fail | A package build has failed | :repository, :arch, :release, :readytime, :srcmd5, :rev, :reason, :bcnt, :verifymd5, :hostarch, :starttime, :endtime, :workerid, :versrel, :previouslyfailed, :faillog
__prefix__.package.build_unchanged | A package build has succeeded with unchanged result | :repository, :arch, :release, :readytime, :srcmd5, :rev, :reason, :bcnt, :verifymd5, :hostarch, :starttime, :endtime, :workerid, :versrel, :previouslyfailed
__prefix__.package.create | A new package was created | :project, :package, :sender
__prefix__.package.update | The package metadata was updated | :project, :package, :sender
__prefix__.package.delete | A package was deleted | :project, :package, :sender, :comment
__prefix__.package.undelete | A package was undeleted | :project, :package, :sender, :comment
__prefix__.package.branch | A package was branched | :project, :package, :sender, :targetproject, :targetpackage, :user
__prefix__.package.commit | A package has committed changes | :project, :package, :sender, :comment, :user, :files, :rev, :requestid
__prefix__.package.upload | Sources of a package were uploaded | :project, :package, :sender, :comment, :filename, :requestid, :target, :user
__prefix__.package.service_success | Source service succeeded for a package | :comment, :project, :package, :sender, :rev, :user, :requestid
__prefix__.package.service_fail | Source service failed for a package | :comment, :error, :project, :package, :sender, :rev, :user, :requestid
__prefix__.package.version_change | A package has changed its version | :project, :package, :sender, :comment, :requestid, :files, :rev, :newversion, :user, :oldversion
__prefix__.package.comment | A new comment for the package was created | :project, :package, :sender, :commenters, :commenter, :comment_body, :comment_title
__prefix__.project.create | A new project was created | :project, :sender
__prefix__.project.update_project_conf | The project configuration was updated | :project, :sender, :files, :comment
__prefix__.project.update | A project was updated | :project, :sender
__prefix__.project.delete | A project was deleted | :project, :comment, :requestid, :sender
__prefix__.project.undelete | A project was undeleted | :project, :comment, :sender
__prefix__.project.comment | A new comment for the project was created | :project, :commenters, :commenter, :comment_body, :comment_title
__prefix__.repo.packtrack | A binary was published in the repository | :project, :repo, :payload
__prefix__.repo.publish_state | The publish state of a repository has changed | :project, :repo, :state
__prefix__.repo.published | A repository was published | :project, :repo
__prefix__.request.create | A request was created | :author, :comment, :description, :number, :actions, :state, :when, :who, :diff (local projects)
__prefix__.request.change | A request was changed (admin only) | :author, :comment, :description, :number, :actions, :state, :when, :who
__prefix__.request.delete | A request was deleted | :author, :comment, :description, :number, :actions, :state, :when, :who
__prefix__.request.state_change | The state of a request was changed | :author, :comment, :description, :number, :actions, :state, :when, :who, :oldstate
__prefix__.request.review_wanted | A request requires a review | :author, :comment, :description, :number, :actions, :state, :when, :who, :reviewers, :by_user, :by_group, :by_project, :by_package, :diff (local projects)
__prefix__.request.comment | A new comment for the request was created | :author, :comment, :description, :number, :actions, :state, :when, :who, :commenters, :commenter, :comment_body, :comment_title, :request_number
OBS hides specific parts/pages of the application from search crawlers (DuckDuckGo, Google, etc.), mostly for performance reasons. Which user-agent strings are identified as crawlers is configured in the file /srv/www/obs/api/config/crawler-user-agents.json.
To update that list, run the command bundle exec rake voight_kampf:import_user_agents in the root directory of your OBS instance. This downloads the current crawler list of user agents as a JSON file into the config/ directory of the Rails application.
If you want to extend or edit this list, switch to the config/ directory and open the crawler-user-agents.json file with the editor of your choice. The content can look like this:
[
  {
    "pattern": "Googlebot\\/",
    "url": "http://www.google.com/bot.html"
  },
  {
    "pattern": "Googlebot-Mobile"
  },
  {
    "pattern": "Googlebot-Image"
  },
  [...]
]
To add a new bot to this list, a pattern must be defined; it is required to identify the bot. Almost all bots have their own user agent that they send to a web server to identify themselves. For example, the user agent of the Googlebot looks like this:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
To choose the pattern for the new bot, compare the user agent of the bot you want to identify with others and look for a part that is unique (like in the Googlebot example, the part: Googlebot).
Let's assume we want to add the bot Geekobot to the list of bots and the user agent looks like this:
Mozilla/5.0 (compatible; Geekobot/2.1; +https://www.opensuse.org)
Our unique part would be Geekobot. So we add a new entry to the list of bots:
[
  {
    "pattern": "Googlebot\\/",
    "url": "http://www.google.com/bot.html"
  },
  {
    "pattern": "Googlebot-Mobile"
  },
  {
    "pattern": "Googlebot-Image"
  },
  [...]
  {
    "pattern": "Geekobot"
  }
]
You can also use regular expressions in the pattern element.
Save the file and restart the Rails application and the bot Geekobot should be identified properly.
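The pattern entries are matched as regular expressions against the incoming User-Agent header. Whether a given agent would be classified can be checked quickly in the shell (illustrative only; OBS does this internally via its crawler-detection library, not grep):

```shell
# Check a User-Agent string against a pattern from the JSON list (sketch):
ua="Mozilla/5.0 (compatible; Geekobot/2.1; +https://www.opensuse.org)"
if echo "$ua" | grep -qE "Geekobot"; then
  echo "crawler detected"
fi
```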