Google Stackdriver hack: inserting your legacy logs inside the cloud

Jean-François Marquis
4 min read · May 28, 2019

When I started using Google Cloud Platform at my company (Adeo), I discovered an integrated stack with logging capabilities I had never imagined in a legacy datacenter. Everything is logged with a high level of granularity, the GUI is very responsive, and the filtering capabilities offer many ways to find the right information quickly. I also discovered the sink function, which lets me store data from Stackdriver in BigQuery for the long term (with an automated lifecycle implemented). Great! I covered a major part of my security requirements (accountability) in less than an hour and everyone is happy: the security guys of course, the data scientists who have more data to correlate, the data engineers who can use their favorite language (SQL), and as a bonus the BI team can produce beautiful reports.

So everything is fine in the world of GCP, but as time passes I become more demanding and want a consolidated view of all my logs: GCP of course, but also my legacy servers. After reading the GCP docs, I can't find any solution to gather logs from my legacy systems and push them into Stackdriver :-(

So I started to look at partner solutions like Blue Medora to cover my prerequisites, but it's too much for my needs (I just need logs). Anyway, during this market tour I had an idea: if other products are multi-source, can't we reuse the fluentd component, which is multi-source and can be chained?

At the beginning I was thinking of deploying one fluentd per server in the datacenter and chaining those legacy fluentd instances to a VM on GCP with the logging agent (google-fluentd) installed (in this case the google-fluentd agent writes to Stackdriver). But there's a better way: most of my servers are Linux with rsyslog installed, so I just have to change the config a little bit to redirect all logs to GCP.

Datacenter side

I just added a new myrsyslog.conf file in /etc/rsyslog.d (and restarted the service):

*.* @myserver:5132
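
Note that a single @ in rsyslog forwards over UDP; if you want TCP instead (matching the second fluentd source below), rsyslog uses a double @@. A minimal variant, assuming the same collector host and port:

*.* @@myserver:5132

Then restart rsyslog (e.g. sudo systemctl restart rsyslog).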

Google Side

In all cases, one GCP project with a Stackdriver workspace is recommended.

Basic solution: one VM.

I added a new fluentd conf file at /etc/google-fluentd/config.d/datacenter-syslog.conf (and restarted the service):

<source>
  @type syslog
  port 5132
  bind "0.0.0.0"
  tag "datacenter-syslog"
  protocol_type udp
  priority_key severity
  <parse>
    message_format rfc3164
  </parse>
</source>
<source>
  @type syslog
  port 5132
  bind "0.0.0.0"
  tag "datacenter-syslog"
  protocol_type tcp
  priority_key severity
  <parse>
    message_format rfc3164
  </parse>
</source>

The Google side implements TCP or UDP to collect the logs, depending on criticality and transport prerequisites (encryption, etc.).
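
Don't forget the collector VM must actually be reachable on that port from the datacenter. A minimal firewall rule sketch, assuming the default network and a placeholder source range of 203.0.113.0/24 for the datacenter (adapt both to your setup):

gcloud compute firewall-rules create allow-datacenter-syslog \
  --network default \
  --allow tcp:5132,udp:5132 \
  --source-ranges 203.0.113.0/24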

Advanced solution

One regional managed instance group with an autoscaling policy, protected by a load balancer exposing only the desired port over TCP and/or UDP.
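
As a rough sketch of that setup (the template, group, and region names are hypothetical, and the load balancer is configured separately):

gcloud compute instance-templates create syslog-collector-tpl \
  --machine-type n1-standard-1 \
  --image-family debian-9 --image-project debian-cloud
# (a startup script would install google-fluentd and the config above)

gcloud compute instance-groups managed create syslog-collectors \
  --region europe-west1 \
  --template syslog-collector-tpl \
  --size 2

gcloud compute instance-groups managed set-autoscaling syslog-collectors \
  --region europe-west1 \
  --max-num-replicas 10 \
  --target-cpu-utilization 0.6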

Logs are ingested into Stackdriver and can be viewed inside the GCP project:

GCE VM Instance -> All instances (or your instance)
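
Equivalently, you can query them in the advanced filter box with something like this (assuming the fluentd tag above ends up as part of the log name, which is how google-fluentd names its logs):

resource.type="gce_instance"
logName:"datacenter-syslog"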

Trick: if you want to print the name of the source machine in the Stackdriver UI, click "View Options" in the grey strip -> "Modify Custom Field" and add jsonPayload.host.

I have also created a dashboard to monitor log ingestion:

  • CPU load for each VM collecting logs;
  • network traffic;
  • log entries by type (error, warning, etc.);

and one graph to monitor Stackdriver API calls (protected by a quota of 1,000 API calls per second). In this graph, 2,000 servers are sending their system logs to Stackdriver.

If you want to sink data to BigQuery, you can do it through the UI or with a command line:

gcloud logging sinks create ${sink_name} \
  bigquery.googleapis.com/projects/${mygcpproject}/datasets/${dataset_name} \
  --log-filter='resource.type=audited_resource OR resource.type=gce_instance OR resource.type=gcs_bucket OR resource.type=cloudsql_database OR resource.type=bigquery_resource OR resource.type=container OR resource.type=dns_managed_zone OR resource.type=gke_cluster OR resource.type=k8s_cluster OR resource.type=cloud_function OR resource.type=global OR resource.type=api OR resource.type=gce_network OR resource.type=gce_firewall_rule OR resource.type=gce_operation OR resource.type=gce_project OR resource.type=gce_snapshot OR resource.type=security_scanner_scan_config OR resource.type=service_account OR resource.type=pubsub_topic OR resource.type=pubsub_subscription OR resource.type=reported_errors'
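
One thing worth noting: the sink writes with its own service account (the writer identity), which needs BigQuery Data Editor rights on the target dataset. You can retrieve it like this and then grant it access from the BigQuery UI:

gcloud logging sinks describe ${sink_name} --format='value(writerIdentity)'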

To conclude, I discovered and now use a great tool (at a reasonable cost) that allowed us to centralize logs for 2,000 servers in a couple of days (in addition to all the logs already collected on the GCP platform). I'm now looking to extend the initial scope to a single backend for both logs and metrics.
