
There's the "Plumber Problem[1]" by John Siracusa where you watch a story and see a problem in your specific domain that leads you to mistrust the whole story.

With the 12 factor nonsense, I see the guidance of "Dump unformatted logs into stdout and make it someone else's problem; they'll hadoop or splunk it and everyone wins." There is no magic splunk that absolves you (the developer) of generating useful data to logs, if you expect useful data out of your logs. You can't reverse "cow -> burger -> sewage" and shouldn't expect some magic hadoop to reverse entropy without a ton of work.

So -- I guess the 12 factor app dictum is fine for some trivial "I rewrote twitter in 80 lines of erlang" app, but in my opinion it is an "if we can land a man on the moon we can land a man on the sun" misstatement of the complexity of actually getting things done in the real world.

[1] https://hypercritical.co/2023/08/18/the-plumber-problem



Spoken like someone who has never had to work on a Java application with 22 custom log files, not including stdout, stderr, and two separate syslogs.

Yes, if you want useful data from logs you'll need to log useful data, but this is a case where 12 factor is so successful / dominant today I don't think you even know what it's arguing against anymore.


> this is a case where 12 factor is so successful / dominant today I don't think you even know what it's arguing against anymore

Thank you! This is the same thing people miss when they read things like the Agile Manifesto, or about DevOps, or even architectural choices like React's original pitch for one-way data binding - or, dare I say, microservices.

You can read these things today, and see flaws - holes in the argument, mistaken assumptions, or misread them as claiming to be superior to things that have been developed since. No - to understand what they are talking about you need to know what was current common knowledge or accepted wisdom in the world when they were written.


No, writing files is nearly as bad as writing to stdout -- both imply someone else has to parse the output.

If the logs matter -- write them to an API that retains log structure by serializing it.

If the logs don't matter, then they don't matter, but typically the logs are used for all sorts of analytics processes after the fact.

Asking someone else to parse your data is rude.


Again, "this is a case where 12 factor is so successful / dominant today I don't think you even know what it's arguing against anymore"


The twelve factor app says: A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout. During local development, the developer will view this stream in the foreground of their terminal to observe the app’s behavior.

That guidance is deeply foolish; I have to assume the rest of the guidance is just as silly for anything but a toy application.


Where does it say "unformatted"? All it says is that your logs should be sent to stdout.

You write your logs in whatever structure you want and have something in the runtime environment (k8s, systemd, file redirection in bash whatever) forward them to the eventual storage location. I don't see the issue.


So -- there's a file, created by something in a container emitting a string of bytes. A file must go through a parsing and serialization phase to be useful to a logging infrastructure. Just because you decorate the contents of that file with { and } and other punctuation doesn't make that file a string of json records, until something's read and verified it.

I'm supposed to just load it into memory hoping it's json or avro or whatever? Nope, errors happen; there's a limit to the number of bytes that can be sent from one place to another before the kernel may interrupt it (you're writing an application log and your runtime emits its own debugging message; they get intermingled because they're both being written to a fifo).

Maybe this is fine (usually it's fine) but eventually it isn't fine. Use a logging library, categorize your logs, maybe start off with stdout but eventually put on some big person pants and use a logging api and use the stderr/stdout for high cardinality exceptions, not "all my logs".
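
To make the "read and verified" step above concrete, here is a minimal sketch of what a collector might do with captured stdout: try to parse each line, keep what parses as a JSON object, and set the rest aside instead of dropping it. The function name `split_records` and the sample lines are made up for illustration; this is not how any particular log router works.

```python
import json


def split_records(lines):
    """Separate lines that parse as JSON objects from lines that do not."""
    good, bad = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            bad.append(text)  # truncated or interleaved output lands here
            continue
        if isinstance(record, dict):
            good.append(record)
        else:
            bad.append(text)  # valid JSON, but not a log record
    return good, bad


if __name__ == "__main__":
    # Hypothetical captured stream: one clean record, one line cut off by
    # another writer, and one line of plain text from a dependency.
    captured = [
        '{"level": "INFO", "message": "started"}',
        '{"level": "ERR runtime: out of memory',
        "plain text from a dependency",
    ]
    records, rejects = split_records(captured)
    print(len(records), "parsed;", len(rejects), "set aside")
```

Keeping the rejects around is the point: a malformed line is often the runtime error someone will be looking for later.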


Just so I understand, you are saying that the entire concept of the 12 factor app is bogus because it tells you to write unformatted logs to stdout because anything you write to stdout is by its very nature unformatted because even if you're outputting formatted data something else in your program might decide to concurrently write into your stdout buffer while you're writing a log line and that will corrupt your entire data stream irreparably?


I'm saying that, the one thing I know about looks like obvious nonsense -- "write your logs to stdout and let someone else sort it out" -- so I assume the rest of it is equally facile and silly.

So now, when I see a reference to "12 factor app" I assume "ah, a hello world app from someone who's going to leave and let someone else figure out how to make it work in the real world at real loads with real regulatory or compliance needs"

In other words, for me, it's a shibboleth for "toy" or "unserious piece of shit"

There may, in fact, be parts of the guidance that aren't deeply incorrect. But anyone who points at it as a way to do things, without caveats, is suspect.


It's not 'someone else'. It's whatever environment the application ends up running in.

The idea of 12 factor is to define a reasonable interface for your application. The environment doesn't need to worry about providing interfaces to logging services or collecting files from a custom directory, or making sure the application isn't going to fill the disk with log output - if the app is 12 factor, the only thing the environment needs to do is collect stdout.

As a 12 factor app, write such that what you put on stdout is useful for that purpose.

As a system hosting 12 factor apps, collect stdout and put it somewhere useful.

As a developer of a 12 factor app, when you want to see your log output, go wherever the environment puts it and look there.

That's a useful contract that works at scale. Not sure why you think it doesn't.


That's a fascinating way to live.

I'd love to know what other things you saw one part of and then dismissed fully without further investigation.

Are iPhones unacceptable because they suggest you set a 4 digit pin for access? Aeroplane safety demonstrations? Languages with gendered verbs?


12 factor does not say ‘don’t use a logging library’

Use a logging library. Configure it to output well formed and categorized log messages to stdout. Let the runtime environment take that stream of data from there.
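
For what it's worth, a minimal sketch of that setup using Python's standard `logging` module. The `JsonLineFormatter` class, the `build_logger` helper, and the field names are my own choices for illustration; nothing in 12 factor prescribes them.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log record."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,  # the logger name doubles as the category
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    return logger


if __name__ == "__main__":
    log = build_logger("orders")
    log.info("order %s accepted", 42)
```

The application only ever sees the logger; swapping the handler for something other than stdout later is a change in one place.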


It certainly doesn't say "use a logging library"

it says: A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout. During local development, the developer will view this stream in the foreground of their terminal to observe the app’s behavior.

In staging or production deploys, each process’ stream will be captured by the execution environment, collated together with all other streams from the app, and routed to one or more final destinations for viewing and long-term archival. These archival destinations are not visible to or configurable by the app, and instead are completely managed by the execution environment. Open-source log routers (such as Logplex and Fluentd) are available for this purpose.

The event stream for an app can be routed to a file, or watched via realtime tail in a terminal. Most significantly, the stream can be sent to a log indexing and analysis system such as Splunk, or a general-purpose data warehousing system such as Hadoop/Hive. These systems allow for great power and flexibility for introspecting an app’s behavior over time, including:

This is, at scale, utter horseshit. If it said "do what's expedient, but use a library and be flexible in how you generate logs, but for goodness sake use any of the publicly available libraries to be expedient now and reliable and flexible later" I'd shrug and move on, but it's just absolute garbage. "Do a shitty job and someone else will fix it for you" is not good advice, even if it lets you get to market quickly.


It's clear you haven't actually taken the time to understand what 12 factor is saying, and you're assuming that anything left unsaid is an implied permission to just screw everything up.

A more reasonable reading of 12 factor is to see that since it suggests that the output streams should be gathered up from multiple running instances of the application, and collated together in a system like splunk, that it behooves you as a 12 factor app developer to build your app such that your output stream is amenable to that approach.

I.e., that your output is a stream of structured serialized data.

You haven't clarified what you think is a better practice than this.


I've had the pleasure of dealing with devs who read this and decide it says "emit json formatted text to stdout and let someone else figure it out, just like it says in some web page." The guidance is explicitly "send it to stdout and let someone else figure it out." Which is terrible advice.

The words don't suggest anything else. Maybe your experience suggests you should use a library and reserve the ability to serialize that data to a proper api if necessary, but that's you, not the words in the 12 factor app, and some large subset of people just do the minimum necessary to get the ticket off their queue, and if that means "printf log" that's what this advice suggests they'll do. It's splunk's problem now!

And my advice is "use a damned logging library; understand the distinction between a networking stack and API that manages abstractions like event size and retries, and stdout, which offers none of that."


Sir, this is a Wendy's.


> It certainly doesn't say "use a logging library"

Right. It's not telling you anything about whether to use a library or not. All it's talking about is where you output your logging streams to, not about how you produce or format those streams. Is it bad advice because it doesn't tell you you should wear sunscreen when you go outside in summer?

(FWIW I haven't seen any logging library add enough value to be worthwhile. Eventually graduating to structured event logs is a good idea, but having started with a logging library doesn't actually get you any closer to that).


What 12 factor is telling you to do is to not build a log file rotator into your application.

It is the least controversial and most broadly beneficial of the 12 factors.

Why do you think it’s foolish?


It says "send it to stdout and let someone else figure it out."

I'm that someone else, and I often get bug reports like "Why aren't my log events all in one record in my thing?" (because you filled your stdout with newlines!) or "Why did my message vanish? It was important!" (because it's malformed json, because your runtime had an error; it's over there instead of over here).

Programming's a stack of details; all of which matter. Log using a logging API where available; use stdout/stderr for exception logging only, if at all possible. If you want your logging to be useful, think carefully about how and what you log.
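
The newline complaint, at least, is cheap to avoid on the producing side. A small sketch, assuming the collector treats each line of stdout as one record: serialize the whole event with `json.dumps`, which escapes embedded newlines, so a multi-line traceback stays in a single line. The helper name `to_log_line` is hypothetical, and this does nothing about a runtime writing its own unstructured output to the same stream.

```python
import json


def to_log_line(level, message):
    """Serialize one event as a single line; json.dumps escapes embedded newlines."""
    return json.dumps({"level": level, "message": message})


if __name__ == "__main__":
    # A made-up multi-line traceback that would otherwise span three records.
    trace = 'Traceback (most recent call last):\n  File "app.py", line 1\nValueError: boom'
    line = to_log_line("ERROR", trace)
    print(line)
```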


That’s a total misreading of what 12 factor is about.

All it means is to let someone else (other than the program itself) figure out how to get the log messages (whatever you decide they should be) from your process (running wherever it is) to wherever you want to have those logs for inspection.

Do not hardcode into your application the assumption that they are written to a file or sent to a port.

In your application, write them to stdout.

In local dev, stdout will probably go to an interactive console. In a container running in a cloud hosted cluster it will probably go to a networked log aggregator that collects it all into an indexed time oriented data store along with other logs from other running instances.

Because you used stdout, that’s easy to do.

If you wrote them to a file whose name your application manages, and changes itself to handle log rotation… both of those are hard to do.


The reason people pay outrageous prices for Splunk is you can get some pretty decent burgers out of sewage if you know what you are doing, and from personal experience the results can be totally game-changing if you are stuck with a mishmash of different log formats and have to support a large distributed application.



