This highlights the main issue I have with Python today, and that's running Python apps.
Having to ship a compiler to a host or container to build Python with pyenv; needing all kinds of development headers like libffi for Poetry; the hoopla around `poetry init` (or pipenv's equivalent) to get a deterministic environment and package set. Or you use requirements files and don't get deterministic results.
Or you use effectively an entire operating system on top of your OS to use a conda derivative.
And we still haven't executed a lick of python.
Then there's the rigmarole around getting one of these environments to play nice with cron, having to manipulate your PATH so you can manipulate your PATH further to make the call.
It's really got me questioning assumptions about which language "quick wins" should be written in.
You can use Bazel to build a self-contained Python binary that bundles the interpreter and all its dependencies by using a py_runtime rule[1]. It's fairly straightforward and doesn't require much Bazel knowledge - there are simple examples on GitHub[2].
There are a couple other tools that take the same approach, including PyOxidizer[3], which was written by a Mercurial maintainer.
> As far as I know the only language making static binaries easily is Go, but it was a first class language design principle.
Rust does this as well.
The official high level build tool, `cargo`, uses a declarative TOML file for dependency management and supports lock files for deterministic builds. The default output is a single, statically linked binary.
Rust does depend on libc (like Go) which brings in dynamic linking on some platforms. But Cargo supports easy cross-compilation, and the `x86_64-unknown-linux-musl` target will produce a fully static binary.
> Binaries produced with PyOxidizer are highly portable and can work on nearly every system without any special requirements like containers, FUSE filesystems, or even temporary directory access. On Linux, PyOxidizer can produce executables that are fully statically linked and don’t even support dynamic loading.
Rust can definitely do it, but there are still a lot of gotchas. Many languages can do it, but there are so many pitfalls; for example, depending on the host's timezone database package.
I would argue Rust does it much better than Go. When you have to resort to hacks like cgo that subtly change the performance and functional characteristics of your program, I wouldn't call it "first class". It's good, don't get me wrong; I like how Go cross-compiles most things. But I wouldn't call it the gold standard as long as cgo continues to be a thing.
Edit: I mention cgo because many who want to cross-compile a statically linked binary will want to interface with other libraries via FFI, and this is a huge gotcha. It is a bit tangential to strict "static binary building".
I decided to drag myself kicking-and-screaming to the 21st century and start writing my handy-dandy utility scripts in python instead of bash. All was well and good until I made them available to the rest of my team, and suddenly I'm in python dependency hell. I search the internet and there are a lot of different solutions but all have their problems and there's no standard answer.
I decided "to heck with it" and went back to bash. There's no built-in JSON parser but I can use 'grep' and 'cut' as well as anyone so the end result is the same. I push it to our repo, I tell coworkers to run it, and I wash my hands of the thing.
jq has been a lifesaver for me parsing json in bash. Of course, it's an external utility not present by default in most systems.
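For anyone who hasn't used it, a typical jq invocation looks something like this (assuming jq is installed):

```shell
# Extract one field from each element of a JSON array.
echo '[{"name":"alice","id":1},{"name":"bob","id":2}]' | jq -r '.[].name'
# alice
# bob
```

The `-r` flag prints raw strings instead of JSON-quoted ones, which is usually what you want when feeding the result to other shell tools.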
Another thing to consider is more of a middle-ground approach. Most systems do have a python interpreter, so you can use a lot of base python without worrying about dependency hell. I use inline python in bash all the time, e.g.
ls | python -c 'import sys,json;lines=sys.stdin.read();print(json.dumps(list(filter(bool,lines.split("\n"))),sort_keys=True,indent=2))'
You can even use variable substitution if you surround the Python code in double quotes, and even mix f-strings and bash substitution:
python -c "print(f'Congrats, ${USER}, you are visitor number ${RANDOM}. This is {__name__}, running in $(pwd)')"
Or use a heredoc to not worry about competing quote chars:
# python << EOPYTHON
print("Congrats, ${USER}")
print("You are visitor ${RANDOM}")
print(f"This is {__name__}, running in $(pwd)")
print("It's a heredoc to allow both quote characters")
EOPYTHON
Great trick with using the python standard lib! Thanks for posting that.
edit: You probably already know this, but for anyone reading along, piping `ls` is unsafe if you plan to use the paths for anything except for printing them out. A path on linux can contain any byte except for NULL, so when `ls` prints them out, you can get broken behavior if you try to break on newlines.
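A sketch of a safer pattern: NUL-delimit the paths with `find -print0` instead of parsing `ls` output, so filenames with spaces or newlines survive intact:

```shell
# NUL-delimit paths so filenames with newlines or spaces survive intact.
cd "$(mktemp -d)"                  # scratch dir for the demo
touch 'plain.txt' 'with space.txt'
find . -maxdepth 1 -type f -print0 | xargs -0 -I{} printf 'found: %s\n' {}
```

Both GNU and BSD `find`/`xargs` support `-print0`/`-0`, so this is reasonably portable.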
Just a question - why do you have a dependency hell? You could restrict yourself to the Python standard library, and you would only have one dependency. The Python standard library is much nicer than bash if you need more complex data structures than what bash provides.
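For instance, a stdlib-only heredoc (in the style used elsewhere in this thread) covers the JSON-parsing case with zero third-party deps; the JSON payload here is just an illustration:

```shell
# Parse JSON with only the standard library -- no pip, no venv.
python3 - << 'EOPYTHON'
import json
data = json.loads('{"user": {"name": "alice", "id": 42}}')
print(data["user"]["name"])
EOPYTHON
# alice
```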
"grep" and "cut" are not Bash; they are separate programs with dramatically different feature sets between distributions and OSes (grep on macOS is very different from GNU grep on a modern Linux distribution, and there are many incompatibilities). Many scripts that work on Linux won't work on Mac because of this.
With Bash, your best bet for portability is to run scripts in a Docker container. If you want portable code, you have to bundle your dependencies--there's no free lunch here, including Bash.
When I was at Google I had a similar problem (team wasn't using Blaze). So what I did was have a wrapper entrypoint around every Python entrypoint that would just run that Python entrypoint (e.g. foo would execute foo.py). The advantage was that the shell script would first set up a virtual environment for every entrypoint and install all the packages in the requirements.txt that sat beside the entrypoint (removing any new ones). Each requirements.txt was compiled from a requirements.in file via pip-compile (part of pip-tools) [1], which meant that devs only had to worry about declaring the packages they actually directly depended on. Any change to requirements.in required you to re-run pip-compile, which wouldn't (by default) upgrade any packages and would only lock whatever the current version is (automated unit tests validated that every requirements.txt matched its requirements.in file).
This didn't solve the multiple versions of python on the host. That was managed by having a bootstrap script written in python2 that would set up the development environment to a consistent state (i.e. install homebrew, install required packages) that anyone wanting to run the tools would run (no "getting started guides") which also versioned itself & was idempotent (generally robust against running multiple times). We also shipped this to our external partners in the factory. Generally worked well as once you ran the necessary scripts once no further internet access was required.
It wasn't easy but eventually it worked super reliably.
I actually did something very similar when my application had to execute a Python script on any old box and I was strictly forbidden to make any changes on the host machine. My application refused to start if Python 3 wasn't found, so I didn't have to deal with that mess. It ran bash, set up the venv, did python-y stuff, cleaned up the venv: take only pictures, leave only footprints.
The caveat is that with mine the venv wasn't destroyed at the end of execution. Instead I put a snapshot of the sha256sum of the requirements.txt file which I double-checked on boot. If that changed then I ran pip-sync.
This was critical for devs because this was the underlying thing for all scripts devs ran (build system, terminal to device, unit tests, etc etc). Startup latency was key & I spent time optimizing that to feel as instant as a native executable unless the virtual environment changed which isolated the expensive part (& generally happened more & more rarely for any given tool as I found the dependency set to mature & freeze pretty quickly).
This had a great side benefit making it super-easy to run the scripts once on an internet-connected device & then use that as the base image for all the factory machines that could then be offline because all the virtual envs had been initialized.
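The hash-gating trick described above might look something like this (a sketch; the file contents are a demo, and the echo stands in for the real pip-sync call):

```shell
# Re-sync the venv only when requirements.txt has actually changed.
cd "$(mktemp -d)"                      # scratch dir for the demo
printf 'requests==2.25.1\n' > requirements.txt
STAMP=.req.sha256
current=$(sha256sum requirements.txt | cut -d' ' -f1)
if [ ! -f "$STAMP" ] || [ "$(cat "$STAMP")" != "$current" ]; then
    echo "requirements changed; run pip-sync here"   # the expensive, rare step
    printf '%s' "$current" > "$STAMP"
fi
```

Because the stamp file caches the hash, every startup after the first is just one `sha256sum` plus a string compare, which keeps the tooling feeling instant.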
This might seem like lunacy, but I really like/recommend Ammonite instead of Python/Bash.
It's Scala, runs on the JVM, and is perfect for writing scripts. (It has a great built-in dependency resolver. It uses Ivy, but it downloads the dependency by itself; you just import it via its "maven coordinate": http://ammonite.io/#IvyDependencies )
It gives you a lot more safety/correctness than Python, and it's a bit simpler to install too. (No need to compile extensions, just get JDK8 and it'll run.)
The solution to this (at least the one we've landed on at work) is to make sure your dependencies are packaged into a yum repo you include on your systems. For us, that's a local private yum repo our systems have access to, into which we package Perl module requirements that aren't in the public repos. We also include our private libraries there. If the utility script is commonly enough used, we'll make an RPM for it as well, or stick it in one of our general-purpose utils RPMs and make sure its dependencies are set. If that's done, you don't have to worry about dependencies at all; if not, you might have to manually yum install a few things that are grabbed from our yum repo.
There are lots of ways to handle this problem, but if you're handling lots of systems, you presumably already have a method you use to keep them up to date and secure. You presumably are also installing Python from the system packages (if not, you probably shouldn't be writing system utils in it unless you can ensure it's the same on every system you guys maintain, in which case your dependency problem shouldn't be a problem), so tie into that mechanism. It's a lot easier to reason about when there aren't two competing systems, and presumably you aren't going to do away with the security updates the distro provides.
While I can understand your pain with Python dependencies, I still can't wholeheartedly endorse that approach. Depending on the case, bash scripts are valuable and should be used instead of Python; in the wrong use cases, though, they can be painful for other developers.
I recently received a script like that from a partner company, used for forwarding data to their API. It was quite long and had a few dependencies that were not visible until you (stupidly) executed it.
A few random thoughts:
- Bash scripts can be run in environments where not all binary dependencies are met. In these cases the script might cause damage if it assumes everything is available.
- When someone is unexpectedly required to modify the script, it can be difficult or cause issues when this is done by an inexperienced developer (in this age I wouldn't be surprised)
- If the script relies on a program needing to be a certain version to get the wanted results, it may cause issues
- The environment where the script is run is usually not a vacuum. Other scripts might change environment variables or change/remove programs in general
While dependencies with Python can cause issues down the road, the trade-off is having some sort of control, as long as you don't execute other binaries directly.
This is why I've switched to writing "quick wins" in shell [or Go]. It's just so much nonsense that has nothing to do with actually programming. Posix shell can be a bit baroque, but you know that it's not ever going to change and because of that, it's pretty easy to ship to any *nix.
There is the question of the dependencies of a shell script, but I find in practice just checking for deps like `curl` at the beginning leads to be a better user experience. It's unlikely that there is going to be a ton of tools you require, and the tools you do require are probably going to be good about backwards compatibility [curl again as an example].
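That dependency check at the top of a script can be as simple as the following (POSIX sh; the tool names in the loop are illustrative, swap in curl, jq, or whatever your script actually shells out to):

```shell
#!/bin/sh
# Fail fast with a clear message if a required tool is missing.
for dep in grep sed; do        # list whatever your script depends on
    if ! command -v "$dep" >/dev/null 2>&1; then
        echo "error: '$dep' is required (e.g. apt-get install $dep)" >&2
        exit 1
    fi
done
echo "all dependencies present"
```

`command -v` is the POSIX-blessed way to do this; `which` is not standardized and behaves differently across systems.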
Except it does change, all the time. There are innumerable differences between the OSX, BSD, GNU, and other versions of common command line tools. There are plenty of cases where `jq` will or will not be available. Finally, there are differences in how `/bin/sh` will interpret things (though there shouldn't be) depending on whether the underlying shell is ksh, zsh, bash, dash, etc.
> There are plenty of cases where `jq` will or will not be available.
Sure. The argument is that it's a lot easier for the user of the program to read an error message that says "jq is required. Run apt-get install jq or brew install jq" than to fuck around with the Python or Ruby ecosystem, especially if they don't work in those languages.
> Finally there are differences in how `/bin/sh` will interpret things (which there shouldn't be) depending upon underlying shell is running ksh, zsh, bash, dash
Do you have an example of code that is written to the POSIX standard of shell that runs differently? I only write POSIX shell, and use https://github.com/koalaman/shellcheck to verify that to prevent that exact thing.
I generally agree with your sentiment here, but be careful with assuming bash==bash
There are differences between versions. I can't even remember what they are off the top of my head like I used to, which makes them all the more aggravating to discover again.
But I would recommend sticking to a subset of bash, not any of the new fancy features like 'globstar' which allows recursively globbing.
There are tools to manage these kinds of tests, like bashenv. But you're in the same problem scope at that point.
I very much agree with the sister comment, and I write my shell scripts for /bin/sh as well. There is this wonderful tool called ShellCheck ( https://www.shellcheck.net/ ) that checks that your script is actually POSIX-compliant if it starts with #!/bin/sh
POSIX shell is miserable for programming anything beyond a couple lines. It doesn't even have arrays[1], so you have no available container types within the interpreter itself.
[1] Well, it has $@, which you can use as a general-purpose array with some hacks[2], but that's no way to live.
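For the curious, the `$@` trick works roughly like this (POSIX sh; the fruit names are obviously just a demo):

```shell
# POSIX sh has no arrays, but the positional parameters can stand in for one.
set -- apple banana cherry     # "assign" the array
set -- "$@" date               # append an element
for item in "$@"; do
    printf '%s\n' "$item"
done
echo "length: $#"
```

It works, but there's only one of it per function scope, which is why it's "no way to live".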
I don't think it's too unreasonable to assume that you'll be able to find bash anywhere you'd find a general purpose python installation and it has plenty of niceties.
But even the nicest shell doesn't solve the dependency problem like statically compiled programs. If I could take my currently running Python code and produce some artifact that would run with nothing other than the python binary I think we'd be in a much better place.
Ohh apparently all I've needed in my life is zipapps.
Agreed that Bash is (relatively) fine, although error prone. My comment was about POSIX shell, which has none of the features (arrays, [[ instead of [, etc.) that make programming tolerable in Bash.
One drawback is that if you want your Bash script to work on macOS, you need to restrict yourself to features that exist on version 3.2 (from 2006) because that's the latest version that will ever be included on macOS by default.
> If I could take my currently running Python code and produce some artifact that would run with nothing other than the python binary I think we'd be in a much better place.
That's a matter of opinion. I don't find using "$@" to be a big deal in practice.
Let me put it like this: I'm a programmer. I don't mind making programming a bit harder for myself if it means that I get to avoid a lot of the non-programing minutia that's part of a modern interpreted environment.
Also, if you're willing to take a dependency on jq, the issue goes away completely.
Most of the time, you don't need all that, since Python has zipapps. You define deps, you zip it, you ship it to any machine with the same OS and the same Python version. It embeds everything and just runs.
We even now have nice tooling to automate the bundling.
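For the record, building a minimal zipapp needs nothing but the standard library (the app name and contents here are illustrative):

```shell
# Bundle a package directory (with a __main__.py) into one runnable .pyz.
cd "$(mktemp -d)"                  # scratch dir for the demo
mkdir myapp
printf 'print("hello from a zipapp")\n' > myapp/__main__.py
python3 -m zipapp myapp -o myapp.pyz
python3 myapp.pyz
# hello from a zipapp
```

Third-party dependencies can be vendored in with `pip install --target myapp -r requirements.txt` before zipping, which is roughly what tools like shiv and pex automate.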
I agree, but it's still way easier than the original story, which is the one you also have with PHP, Ruby, JS, etc.
Using an interpreted language always leads to this.
I know of no popular interpreted language with a seamless experience for shipping a standalone exe.
In fact, Python is probably the one with the best story here, since it has Nuitka (https://nuitka.net/), which can compile Python code into a fully standalone exe.
But then you need to install a compiler, headers, etc. And no cross compilation of course. Not to mention on Linux, you have to ensure you target the lowest version of libc you can.
You are still very far from Go or Rust, and I'm hoping one day that RustPython will succeed because that would mean an amazing deployment story.
Meanwhile, you trade the ease of deployment of compiled languages for the ease of development of interpreted ones.
I think it's a fair trade for most people: you develop the program much more often than you deploy it.
That doesn't mean we shouldn't work, as a community, to improve the deployment story. It's a serious hindrance.
Rust has a fantastic deployment story: compiling a rust program is super easy, and you can cross compile. Using cargo and rustc is a breath of fresh air compared to any similar experience with C compiling.
So if one day RustPython gets compatible enough with CPython that you can use it as a drop-in replacement, you could create a tool that compiles the Python VM for any target and brings your program along with it. Making a standalone version would become much easier.
Right now, doing so either requires you to bring in a pre-compiled version of CPython for your target (which is what Briefcase does) or compile the thing yourself with gcc + headers + deps (which is what Nuitka does).
> So if one day RustPython gets compatible enough with CPython that you can use it as a drop-in replacement
I don't think this will ever happen unless the community converges on a standard C-extension interface. Presently Python leans so hard on C-extensions, but there is no standard interface--if you're writing a C-extension library, you just depend on whatever obscure corner of CPython that suits your purpose. If you're writing an alternative Python interpreter, you have to implement the entire surface area of CPython, which generally means you must implement CPython exactly and you are severely restricted on the improvements you can make. At that point, why even bother?
Fortunately, I think there are emerging candidate interfaces, but the community needs to either update C-extension packages to use those interfaces or support packages (and maintainers) who already do. https://github.com/pyhandle/hpy.
There are probably only a dozen popular C extensions that need to support HPy to reach the tipping point of mass adoption: numpy, scipy, pycuda, tensorflow, matplotlib, uvloop, etc., and some DB drivers.
The rest aren't popular enough to be a blocker. You will hear them scream a lot, but they will be like 0.00001% of the user base, and we can just tell them to stay on CPython with its limitations. They don't lose anything; they just don't gain anything either.
Those C extension authors are in direct communication with the Python core devs, when they aren't core devs themselves, so if HPy is adopted, we can expect total adoption within 5 years.
The numpy authors have already said it would take a year to adopt.
Given the huge number of benefits of HPy, I deeply hope it will be a success.
I'm not sure. I would certainly add psycopg2 to that list, since it's really the only well-supported way to speak to a Postgres database via Python. I imagine other database dialects will have similar issues. And there's probably a whole host of other prominent libraries that we're just not thinking about because we only run into them when we're trying to use something like Pypy, and even then we only run into one or two at a time before giving up and going back to CPython.
youtube-dl for example is distributed as a zipapp and it seems to work just fine. It only requires you to have Python installed on your system, which isn't too burdensome a requirement on macOS/Linux. On Windows they do actually distribute a Python interpreter.
As usual with extensions, you are not using Python anymore, but a compiled language. To get 100% certainty, you'd need to compile the whole thing.
That being said, a lot of extensions are pre-compiled and provided as wheels, which is the case for tensorflow (I don't know about CUDA; I can't test on a laptop without a GPU).
Let's see what this means:
$ py -m venv test
$ test\Scripts\activate
$ pip install tensorflow
$ code hello_tensor.py
import tensorflow as tf

def main():
    with tf.compat.v1.Session() as sess:
        a = tf.constant(3.0)
        b = tf.constant(4.0)
        c = a + b
        print(sess.run(c))

main()
- it will only run on the system this particular wheel has been designed to run on. In my case cp38-win_amd64.
- it will come bundled with tensorflow, which is a behemoth, meaning your hello world .pyz will be around 500 MB.
- it needs to unzip, so the first run will be REALLY slow
For something like this, I would advise a more generic deployment tool, like fabric 2 if it's remote, or a make-like tool such as doit if it's local only.
Zipapps are an order-of-magnitude improvement in the Python world, but there are still lots of other major pain points, like dependency management and performance, which still leave Python well behind its competition. Hopefully these things change going forward.
It looks like zipapps built with “shiv” need to extract the contents of the zip file to disk before they can run? Does it delete the extracted files on exit?
If so, the extraction is going to make startup very slow. If not, that’s just messy. Either way, it’s not ideal.
But it beats shipping your entire dev env to the server.
I find it a good compromise. The extraction is done in $HOME/.shiv/{zipappname}_{zipapphash} so it's not a horrible mess. But if your project is big, you do have to clean up old installs because they can eat a significant amount of space.
I probably haven't bought into all of poetry yet but for deployment, I have been using "poetry export" to get the pinned requirements.txt, commit it to the repo and install to a virtualenv. A bit of work to keep it in sync with the poetry dependency file but that's ok.
For PATH with cron or others, I use the full path to the virtualenv such as /path/to/project/.venv/bin/python. The path can be extracted by "which" or "Get-Command" when the venv is active.
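That crontab entry then looks something like the following (paths hypothetical); because the venv's interpreter is called by absolute path, cron's minimal PATH never comes into play:

```shell
# m h dom mon dow  command
0 3 * * * /path/to/project/.venv/bin/python /path/to/project/job.py >> /tmp/job.log 2>&1
```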
Using a python version different from the system python version is probably the messiest part but well, targeting 3.6 is alright.
I do agree it could be better and it's not quite as streamlined as other ecosystems.
Honestly, `pip freeze` includes the whole content of a venv's site-packages, with exact versions. For most projects, that's equivalent to all the dependencies recursively pinned with poetry, although you don't get the clean pyproject/dev-prod/lock file separation.
So a huge number of cases can be handled with just that. It will be "reproducible" enough for a lot of people.
> although you don't have the clean pyproject-dev-prod/lock file separation
That's why I use "poetry export -f requirements.txt > requirements.txt" instead of pip freeze. It only exports prod requirements from the poetry lock file.
Strange, this does seem to work with python3.8 on ubuntu 20.04 (the site-packages shows up in sys.path), but for me in a virtualenv bin/python is a symlink to the system python, so how does python 'know' what path to use? Is there logic baked into the interpreter?
I seem to recall that with python2.7 that calling bin/python in a virtualenv without activating the virtualenv did not used to "work" (i.e. it would use the system packages). Did this change at some point or is my memory just wrong?
If the path to your executable is fixed, just put it in the shebang and you're done - makes everything way more explicit at the cost of some dynamic behavior.
An anecdote: Homebrew uses this method for shipping python executables.
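A sketch of that fixed-shebang pattern (the venv path is hypothetical):

```shell
# Bake the venv interpreter into the shebang; no activation step required,
# because the interpreter's own location determines sys.path.
cd "$(mktemp -d)"                  # scratch dir for the demo
cat > mytool << 'EOF'
#!/path/to/project/.venv/bin/python
import sys
print(sys.executable)   # reports the venv interpreter, not the system one
EOF
chmod +x mytool
```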
The "production version" of your script should be running in your system environment with system packages. pyenv and friends should be used for testing with different versions and making sure you don't accidentally depend on idiosyncrasies of your box.
The exception is if your python thingy is "the main thing" running on a server, i.e. your customer facing webapp.
I once threw a relatively complex Python application with background server/client processes at Cython and the generated .exe literally just worked without any special effort. I don't know how transferable that is, but N=1 it's not always as hard as what you're thinking.
In other words, it becomes the concern of the person shipping the code, rather than the concern of the person trying to run the code. That's exactly how it should be.
Are people still suffering through hosting Docker containers on Windows? Why would anyone do that at this point other than to comply with outdated, arbitrary IT policies?
You also can't run any program without a computer and an OS; there are some basic prerequisites to running software. Having a Docker/container host has become one of those prerequisites for many applications, but it actually reduces the headache of numerous other traditional prerequisites.
I don't want to have to run a simple Python program in a container for quick and simple development or testing. That's a failure of engineering discipline. By all means, do provide a Docker container and do use containers for actual deployments, but also make it easy for me to just use, say, pip-tools or whatever else your organization has standardized on for Python. If we're talking about something with complex C or C++ dependencies that's quite different. If it's just a few pip dependencies and there's no way for me to just run it reliably outside of a container, though, that's a result of not following best practices.
Agreed, I typically include a README as well as a requirements.txt so one can easily 'pip install -r requirements.txt' and then 'python app.py' to run simple apps without a bunch of rigamarole.
I constantly use Docker in my job and projects, yes.
Yet I do not believe, nor advocate, that it gets rid of the complexity.
Depending on the user's needs, your dockerized application will run on a different base distro. Alpine and musl for a small OS footprint? Or Debian (or debian-slim) for glibc compatibility?
Those concerns are the same with or without Docker. Docker makes things easy, just not those things, because that is not its purpose.
I typically specify these things in the Dockerfile - if the end user wants to modify the Dockerfile because they prefer Alpine over Debian... they've now taken responsibility of maintaining their customized Dockerfile and ensuring that everything runs as expected. This doesn't seem like something that would be encountered with any frequency in my experience, and you would technically have the same problem with or without Docker in the mix.
In the professional world, your end user is either:
- someone without the skills to write a Dockerfile
- another team that isn't responsible for integrating your work
The packager of an application is part of the project's team. It's not up to the user to package your application.
- Doesn't require any build steps or extra hoops if you're fine with skipping static types
In general it just does a really great job isolating from the environment. No messing with environment variables, most things even run fine on Windows out of the box. All you need is node itself installed and you're off to the races, whether you're starting a new project or running one you checked out from github.
For a trivial-ish command-line tool, I've enjoyed using pyinstaller with --onefile to put out a single-file executable. Using GitHub Actions, it was also relatively easy to create cross-platform releases.
It's evident from reading the OP and previous similar posts on HN that many developers find it difficult to specify and replicate deterministic Python environments for their applications. Personally, I have found it best to use (a) a virtualenv or conda environment, with (b) a requirements file that specifies fixed version numbers for packages (e.g., `pandas==1.0.3`). Only very rarely have I run into issues doing this; it works quite well for me.
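Concretely, that workflow is just the following (the pinned package is illustrative, and the actual install is a network step, so it's left commented in this sketch):

```shell
# Pin exact versions so every machine resolves the same environment.
cd "$(mktemp -d)"                  # scratch dir for the demo
printf 'pandas==1.0.3\n' > requirements.txt
python3 -m venv .venv
# .venv/bin/pip install -r requirements.txt   # replay the exact environment
test -x .venv/bin/python && echo "venv ready"
```

Any other machine repeats the same two steps against the committed requirements.txt and ends up with the same package versions.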
--
That said, from a security standpoint, I'm not sure it's a good idea to run a script downloaded from the web, without verification, on your local command line:
curl https://pyenv.run | bash
If that URL ever gets hijacked, you would be running malicious code. At a minimum, you may want to take a look at the script before running it, or otherwise verify that you're downloading what you actually want to run.
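The safer pattern is to fetch to a file, read it, and then execute deliberately. A sketch, using a local stand-in for the download (with a real URL the first line would be `curl -fsSL https://pyenv.run -o installer.sh`):

```shell
# Download first, inspect, then run -- never pipe a URL straight into bash.
cd "$(mktemp -d)"                  # scratch dir for the demo
printf '#!/bin/sh\necho "installer ran"\n' > installer.sh
head -n 5 installer.sh             # read before you run
bash installer.sh
```

If the project publishes a checksum, comparing `sha256sum installer.sh` against it before the final step is an easy extra safeguard.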
I think `curl | bash` is treated unfairly. Whether you `git clone` or `curl` a script, you are fundamentally doing the same thing: downloading and executing code from the internet. `git clone` just feels safer because it is hiding that fact under layers of abstraction.
If I want to run pip, I need to trust the PyPA. It's their code I want to run, and I need to download it one way or another. If I don't trust them to keep their domain secure, I don't see why I would trust them to keep their GitHub repo secure.
And the whole point of pip is to download code from PyPI and run it. pip, git, curl|bash all do the same exact thing in this case. curl|bash just smells funny because it makes it more plainly obvious what is going on.
Don't you get a bunch of incompatible packages when you restrict to specific fixed version numbers without indication? I guess this is only helpful if you don't plan to reuse your code in another project.
Good question. The approach I proposed works for production applications that require an easy-to-replicate, deterministic environment, but I wouldn't recommend it if you're trying to build, say, Python packages or frameworks meant to be used in diverse environments.
Perhaps this is just my own inexperience showing but I haven’t ever had an issue using venv (which is included in python3 now) and a requirements.txt file.
I think that at this point all the ceremony around setting up and deploying a Python project outweighs its 'easy to read and use' aspects. Unless there is a library you can't live without or rewrite, it seems like a language with better tooling and the real benefits of a type system is a better choice.
If you're mainstream: Go or Java.
If you're edgy: Nim, Scala, or Crystal
All of those have much more sane type, build, and packaging systems.
@perl-people, was this a solved problem when Perl was big? Or is python walking the same roads?
It most definitely was not a solved problem when Perl was big, as far as I know Ruby is the language that finally solved it with bundler, which was released after Rails was, so that's like 2007 or something.
The recipe for the 'solved' deployment:
- a compiler that ships with a build tool (so building is homogenous in the community)
- a centralized or at least uniformly accessible package repository (so dependency acquisition is homogenous in the community)
- a common file format for describing dependencies (so dependency resolution is homogenous in the community)
- a common file format for locking dependency versions (so deployment can be done reliably without vendoring dependencies)
- optional but very nice: a tool for managing compiler versions so it's easy to switch/upgrade projects
Any programming language that has all of these boxes ticked is a modern programming language in my book. As far as I know Ruby is the first that ticked all of them, but other ones I've used that have this: Node.js, Go, Rust, Haskell, Python (though it's a bit messy). I'm pretty sure C# checks them as well nowadays, but I haven't used it in over 5 years so I'm not sure. Same for Java.
Somebody needs to tell the Racket people about this. Dependency management in Racket is still C-style no-version-pinning-anything-goes. The third party dependency managers (see below) are all primitive and do not support multiple versions of the same libraries which is a standard feature in Go/Node.js/Rust/Java class loaders
I’ve been toying with Nim lately to wrap C++ ML libraries, which gives a nice Pythonic syntax but compiles to a binary wrapping the ML lib. Seems to work well for Torch. There's a nice wrapper library nimtorch for Nim [1]. It's a bit out of date but would be easy to update, probably. Well, easier than bundling pytorch on an embedded device. Even manually wrapping the needed C++ libraries isn't that hard in Nim, IMHO.
Overall looking at Python deployment story after using Elixir for the past couple of years makes me cringe a bit. Rather I have no idea how to do it. Deterministic versions, lock files, and container/tarball (or binary) support seems a given in 2020.
Yes and no, Pytorch (like Tensorflow for Python) is just a wrapper around a core C++ library. So you can compile a binary from C++ linking to the pytorch libs without including Python. Technically it's pytorch, but without Python and Python dependency management, which is way simpler. Nimtorch lets you wrap the C++ API of pytorch with nice Pythonic-looking code, a GC, but using only C++ and possibly statically linked. Win-win.
Yes I understand pytorch quite well - I was merely confused because I consider the C++ library to still be pytorch.
I don't see the advantage of this setup over just loading a torchscript model in C++ or any other static language. A full set of bindings seems unnecessary unless you need to train in nim.
True, my terminology was a bit outdated as I keep thinking of pytorch as just the Python wrapper on libtorch. That's not really true.
I prefer to avoid Python nowadays due to the pain of dependency management; as nice as this article is, I don't care to learn about Poetry. So training in pure C++ (or better, Nim) is my preferred setup. Keeping a similar build setup for both training and deployment saves a lot of headaches, which is why the nimtorch interface is handy even if it's not as full-featured/up to date. For now I'm only deploying a very simple NN without much need for experimentation.
It does, if your extensions are provided as wheels. In which case the resulting pyz file will be runnable on any machine using the OS those wheels are compiled for.
Yeah, shiv works OK. I've run into some stupid issues with the self-extracting directory ending up with too-long path names on Windows and such, so it's not perfect.
I have to say, at first blush I would not choose click over argparse. I do have to look up the docs of argparse every time I use it, but I like that it just gives me the args and lets me structure the program flow how I want to, which I think is more natural.
And then I can do things like import a big package (pandas) only after parsing the args, which is highly convenient to users that want to check the argument options without a five second lag.
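That lazy-import trick looks roughly like this (a sketch; the `csv_file` argument and the pandas usage are just illustrative, not from any real CLI):

```python
# Sketch of the lazy-import pattern: `--help` returns instantly because
# pandas is only imported once we know we actually need it.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Toy CSV summarizer")
    parser.add_argument("csv_file", nargs="?", help="path to a CSV file")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.csv_file is None:
        build_parser().print_help()
        return
    # Heavy import deferred until after argument parsing.
    import pandas as pd
    print(pd.read_csv(args.csv_file).describe())

if __name__ == "__main__":
    main()
```

Running `--help` never touches pandas, so there's no multi-second startup lag just to see the options.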
I recently wrote this (https://github.com/jamesob/clii) because I can't stand click and got sick of having to check the argparse docs every time I wanted to write a CLI. I guarantee you'll spend a tenth of the time trying to figure out how to use this thing, it has no dependencies, and is implemented in a single vendor-friendly file.
I agree; click's approach is like some kind of old-school 4GL that tries to automatically create GUI elements from your database tables, except mapping CLI to functions in your modules. People should be putting enough thought into their CLI that Click doesn't really help them much.
I have a really high usefulness threshold for adding external dependencies to Python projects. If you can get away with never getting into the virtual environment mess, distribution/installation/development becomes _so_ much simpler.
Of course, sometimes you can't avoid external dependencies (personally, this often involves pandas). But the standard library gets you really far. And even though urllib.request is clunky, I will only use Requests if something else already is forcing me to add external dependencies.
I definitely agree for production-grade applications. Most of the time I'm using python, though, its to script a task I need to do or throw together a small app on a raspberry pi at home. In those cases, I have my own "standard library" of packages I always have installed system-wide, like requests.
Unless I am making a "real" application, I do my best to avoid virtualenvs altogether.
You are not alone. __main__.py in a zip has been supported literally since Python 2.6, but very few people knew about it. Eventually, in 3.5, the zipapp module was released to make the feature more discoverable, but again it has been mostly ignored.
I think the reason is that something like shiv was missing: it streamlines the process, shows a finished product instead of just telling people all the things they can do, and the automatic unzipping solved plenty of problems that you had with alternatives like Pex, especially on Windows or with static resources.
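For reference, the stdlib zipapp workflow really is tiny. A sketch that builds a runnable archive from a throwaway directory (the app name and contents are made up):

```python
# Build a minimal, runnable .pyz archive with the stdlib zipapp module.
import pathlib
import tempfile
import zipapp

# Throwaway app directory containing a __main__.py.
src = pathlib.Path(tempfile.mkdtemp()) / "myapp"
src.mkdir()
(src / "__main__.py").write_text('print("hello from a zipapp")\n')

target = src.parent / "myapp.pyz"
zipapp.create_archive(src, target=str(target),
                      interpreter="/usr/bin/env python3")
# Now `python myapp.pyz` (or `./myapp.pyz` after chmod +x) runs __main__.py.
```

Dependencies can be pip-installed into the source directory before zipping; that's essentially what shiv and pex automate.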
Urllib and the surrounding web-API modules have made such great strides (and are built into 3.x) that Requests isn't needed in almost all cases these days - I find some of the error handling it papers over at lower levels to be more problematic than useful.
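A stdlib-only JSON GET isn't that bad, for what it's worth (a sketch; the `get_json` helper name is mine, not from any library):

```python
# Fetch and decode a JSON response using only the standard library.
import json
import urllib.request

def get_json(url, timeout=10):
    # urlopen returns a file-like response; json.load reads it directly.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

No sessions, retries, or connection pooling, of course - if you need those, that's when Requests (or urllib3 directly) starts earning its keep.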
90% of the time the only lines in my requirements.txt file are for PyYAML and Jinja2.
Using just the Python standard library is the ideal case for scripts and other sysadmin-ey tools - no dependencies, runs everywhere (tested in multiple Python versions via tox) in a single source file.
I think it's a very enthusiastic post. Anything can be over engineered - and generally young enthusiastic programmers are keen to learn about the options.
I'd be tempted to clone the structure of the posts, and replace the images with some stock photos of supermodels lying down on modernist furniture with cobras and other snakes wrapped around them or lying on the floor nearby.
It's not the language Python, it's the... how to build a project that others can use. It's the scaffolding that 'just' knowing how to program does not teach you, but working in a real environment forces you to learn.
Just skim-read but it seems to cover the sensible parts - all of this applies if you are writing 100,000 lines of code or 3 lines of hello.py
I am trying to write thedevmanual.com which is basically all of that - what it takes to run in real life. it is of course opinionated, and in lockdown :-(
I am not into Python, but the article lost me before it even started: you are required to install a bunch of compiler tools on your device to be able to proceed.
Excuse me? Is this really what hypermodern Python looks like?
No. Python is nice since it just works. You make a 10 line .py file, and it's super-simple.
This article lists a set of bleeding-edge tools, should you choose to add them and learn them all in one place. I wouldn't use half the tools (they're too hypermodern), but it's helpful to know where things are moving.
It’s been a long time since Python just worked. You can’t just hand someone a Python script and expect it to work on their machine. Yes, if you keep things simple and avoid all dependencies except the most trivial maybe you can get away with that. But that’s not going to be the case if you’ve got whole teams using Python.
When I run teams doing Python development, I'm hyper-disciplined about avoiding unnecessary or bleeding-edge dependencies. My experience is that dependencies save time in the short term, but lead to exponential maintenance costs in the long term. I view each dependency the same way as technical debt.
I also generally don't lock versions on dev machines; code should use the core, supported API, and not break on bleeding-edge functionality and API changes. I lock version on deploy machines, obviously.
Smart people on my teams don't like this approach, though, so I could be wrong.
But you can do Python this way. And beginners definitely should start by doing Python this way.
I'm sympathetic about bleeding-edge dependencies, but just handling fairly mundane Python dependencies is really easy with venv, pip-tools, and good standards across projects. Of course you always have containers for actual deployments.
When you say beginners, I think it depends on whether you're referring to programming neophytes in general or professional developers who are new to Python specifically. In the latter case, I actually think it's really important for newcomers to Python to get into best practices like this very early on -- indeed, pretty much immediately. Otherwise they're going to end up either being unable to use any interesting dependencies or being unable to distribute their work in a way that is easy and convenient for others to hack on. Do this with a bunch of people simultaneously and it's a big problem.
1) There's a world of difference between that and docker, and especially docker with containers for not just postgresql, but a half-dozen specialized data stores, queuing systems, MTAs, etc.
2) There's also a world of difference between having numpy / pandas / etc. in your requirements.txt, and having those pinned to a specific version. I'm okay with one or two pinned dependencies on any specific project (for example, if there's an overall project built on Django).
But if you're using the corners of standard libraries in ways where version 1.65 works and 1.73 doesn't, you're probably doing something wrong. You're probably using features which are too bleeding-edge. I'm okay with a few conditionals in code too (if library is 1.65, do X, and if it's 1.73, do Y).
When I've seen systems that depend on nuances of specific versions, upgrades turn into "migration to [library] 1.73" and eat up weeks of developer time. It gets worse when you have cascades (upgrading library X means upgrading Y, etc.).
And goodness help you if you want to integrate two systems built in docker with pinned everything and fine-grained dependencies.
A lot of this also comes back to willing and able to say "no" to features which take 15 minutes to introduce, but cost time down the line to maintain.
Systems which install on Ubuntu without virtualenv or pip (just apt-get installing packages) are an ideal I strive for. It's usually one I don't hit (and it's also not how I develop, obviously -- it's not for me, but for my users, as well as for the discipline).
I can’t tell if we’re in agreement or disagreement. I don’t disagree that one should avoid exotic dependencies or unstable behaviors from specific versions of libraries. Version pinning is more about just making sure someone else can run the program. It’s not about (or at least shouldn’t be about) creating a reliance on odd corner case behaviors. We almost never manually pin versions — pip-tools does that automatically.
Your argument probably would be that the dependencies used should be so simple and core that the risk of it not working with someone else’s package set should be minimal or zero. That’s just a bit too extreme for my taste. I want builds to be 100% reproducible. This is exactly what modern build tools for other languages do.
Re: Docker, I don’t think anyone is claiming pip and virtualenv are somehow a replacement for that.
Re: apt-get, we tend to actually avoid this. It’s really not a good package manager at all and can easily break. We’re going in the direction of nix instead and may even port our entire Python workflow over to it or bazel at some point.
(1) I want builds to be 100% reproducible on deployment servers and on CI/CD pipelines. Otherwise, you can get undebuggable Heisenbugs. On the other hand, I don't want builds to be reproducible between developer machines. If I'm running Python 3.6 on Ubuntu, and another developer is running Python 3.7 on a Mac, and we have slightly different versions of numpy, that makes sure the system is not too brittle. Come to think of it, if I had infinite resources, I'd have several build machines with different (reproducible) configurations.
(2) I'm a lot more spartan about dependencies than other developers I've met.
(3) I'd never use apt to manage Python packages myself in something I'm working on. The constraint is in the other direction. If I build a tool, a user ought to be able to install it using apt in some future version of Debian, and likewise for other systems. Even if that's an abstract user.
I've found that if I develop this way, the upsides outweigh the downsides, especially over extended periods. A lot of software gets built like a system which can only live in one place. There's a set of AWS machines, code on them, and that's the system. There might be a few copies of it (stage+dev+etc.), but you can't move it somewhere else. I like systems I build to be portable. Someone can bring them up and running on their own machine, ideally in a few minutes. I've always found that to be cheaper, in the long term.
I feel like your idea in (1) is not that unachievable with finite resources. It depends how far you took it but requiring tests to pass in a few mild perturbations of the target environment wouldn’t be that expensive in a lot of cases and not even that hard to set up. Sounds like this deserves a name like “perturbative testing” to me if it doesn’t already have one.
In abstract, it doesn't take a huge amount of time and resources to do that. But, there are probably around a hundred higher-priority ideas achievable with the same resources which would take priority over this on the projects I'm working on right now.
On projects I've worked on before, I think this would have made sense /technically/, given project priorities, but so did many other things which weren't done. It's a lot easier to make the case for resources for customer-facing features than for technical debt or infrastructure. So there's the political component too, which varies organization-by-organization.
This is already done in a lot of projects with hardware. The Linux kernel will run on a thousand hardware and software configurations before integrating features.
If I did this, I'd probably want at least three builds:
* my pinned deployment versions (sometimes a release or two behind, sometimes bleeding-edge)
* latest released version; and
* HEAD
If an upstream project is introducing a breaking change, I'd know immediately. That'd be super-helpful, probably both to me and to those projects.
Come to think of it, the right way to do this might be to have three virtualenvs on my local machine, rather than just different targets in CI/CD....
Those are needed to compile different versions of Python with pyenv.
Usually distributions bundle one or two specific versions of Python. Pyenv makes it super easy to install and use all the versions of Python that you want.
Even though for most people it might be enough to just use whatever version of Python comes installed with your system, for a team it might be important that everyone has the exact same version.
Moreover, pyenv-virtualenv makes it painless to use virtualenvs and so I recommend you give pyenv a try even if you do not need additional Python versions.
Again, do these tools really require a C compiler toolchain? If so, they are completely out of the question - they add much more complexity than they could possibly fix.
Why not just install the required version of Python, maybe from a 3rd party repo if not available in the main repos? Why would I ever want to get a compiler toolchain to get my Python interpreter?
Could you explain why it is such a deal-breaker in your opinion?
It is a single `apt-get install` (which probably downloads less MBs than our typical `npm/yarn` install).
Consider that if pyenv came pre-packaged for ubuntu/debian, those packages would be runtime dependencies and then you would just need to `apt-get install pyenv`.
Can't you simply download Python 2.7, 3.6 and 3.7+ to separate folders, and use those in your different projects?
I understand that virtualenv and maybe even pyenv are useful if you need different requirements for different projects using the same version of Python, as apparently pip installs packages globally. But for your setup, I don't get why something like pyenv really helps...
> Can't you simply download Python 2.7, 3.6 and 3.7+ to separate folders, and use those in your different projects?
Yep, most people can just do this. Then, they want a little script over the top that downloads the different versions of Python for them -- just to make life easier for them. Wouldn't it be handy to also script the installation? It'd also probably be useful to automatically setup the version of Python I want to use when I switch folders, so I'm not constantly running the wrong version when I change projects.
If you don't need to use multiple versions of Python, like me, then you don't need it. Just use the built-in venv module in Python 3 to create and manage virtualenvs. For example, simply run `python -m venv .venv` to create a new virtualenv in your project dir, and VS Code will automatically recognize it when you add the project dir to your workspace.
But having a C compiler toolchain available is pretty much standard practice when using Python (or Node.js, Ruby, etc.) because you might need to install some library that requires compilation from source (e.g. psycopg2). If you don't want that, you'll be stuck with installing those 3rd-party libs from your distro's repo, which might be out of date or outright missing (especially for less popular packages).
> Why not just install the required version of Python, maybe from a 3rd party repo
That's the thing. pyenv just downloads the source code of the specified Python version, and then compiles it on your machine. They did that to be agnostic of the OS you're running on.
My workplace got burned recently using this tool for some deployment reasons. I have to be honest, my experience with it in other ways wasn’t ideal. I won’t be using it again.
It’s a failure of the author to explain intent. The entire article is WHAT, not WHY.
If you are already hyper-familiar with python, you know what you are looking at. Yes, of course I’ll want to use pyenv, “everyone knows that”. For the rest of us, it just seems like a lot of steps to do - for some reason.
I think it’s different from just not being the target audience. A good tutorial explains intent.
Great read! I see a few approaches I'd be interested in adopting. Nox sounds particularly useful for tooling consistency across development environments.
It's worth noting that this is a rather opinionated toolset, and that we shouldn't mistake opinionated for hypermodern. You could replace poetry with pipenv (particularly now that it's being maintained again...), and I tend to prefer unittest to pytest. The further into it you get, the more opinionated it is -- for instance, code coverage support and CI/CD is not one-size-fits-all, but the choices of CodeCov and GitHub Actions are nicely illustrative.
Please no pipenv. A horribly managed project with worse design choices than Poetry. And a confusing name to boot (it makes it sound like a clean bridge between pip and pyenv, but it's definitely not). On top of everything you can throw in Ken Reitz's ego. Literally the only reason pipenv gained traction is because there are still an unnerving number of people in the Python community that think Ken Reitz shits gold, when he is clearly more of a one-hit-wonder (love requests btw).
Even the testimonials are self-important.
> Pipenv is finally an abstraction meant to engage the mind instead of merely the filesystem.
Like, really? I want to interpret that as a joke, but I don't think it is.
Poetry should be the future of Python dependency management. I'd like to forget pipenv ever happened. Unfortunately since PyPA took pipenv under its wing, that might not happen.
I often end up frustrated when jumping back from pytest to old projects that are still using unitest
I really like pytests's fixtures and it's caplog stuff. Out of interest, what do you prefer in unittest?
Wouldn't hyper modern Python rely on containerization for isolation and portability? Poetry and pyenv seem like incremental improvements rather than a qualitative leap forward.
Heh, but that's one modern thing I don't like. Someone complained about click adding to startup time, running a new container is certainly "modern" but tiring.
It's certainly overkill for a one man side project. But if you have a team and deploy to clustered servers, virtual environments sure start looking like a horse and buggy.
Yes, this! I'm not against the language evolving, but why do so many people seem to want to turn it into JavaScript? I've been using Python for the last 10 years, and when learning it went out of my way to make sure I was using the idioms of the language and not just use it as a different language with a Python syntax.
Those tools have existed for years, and as always with Python, they serve only a subset of the Python users.
Not that it's a bad article. If you don't know about them, give it a read, it's good to know it exists and the post is well written.
But don't think it's the ultimate stack or whatever.
And if you learn Python, you should always strive to master the basics first:
- installing Python from python.org for Windows and Mac, or via apt/yum on Linux (those ones are tricky; it's more than just the "python" package to get pip and venv)
- don't try to get the latest and greatest version (which is 3.8, or 3.9 alpha today). 3.6 is great already and is widely available. I personally strive for 3.7 now, but I'm happy with 3.6 if I don't have to use asyncio. So get the most modern version that is easy to install for you.
- be comfortable with "-m", the "py" command on windows, "pip" (-m, --user, install, install -r, freeze), and "-m venv".
- make sure you understand what PYTHONPATH is and how it works with regard to the import system
- know how to configure your favorite editor to work with those
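To illustrate the PYTHONPATH point above: the import system just walks sys.path in order, and PYTHONPATH entries land near the front of that list (the extra directory below is a made-up example):

```python
# `import foo` searches the directories listed in sys.path, in order.
# Entries from the PYTHONPATH environment variable are inserted near the
# front, right after the running script's own directory.
import sys

for entry in sys.path:
    print(entry or "(current directory)")

# Modifying sys.path at runtime has the same effect as setting PYTHONPATH:
sys.path.insert(0, "/opt/shared_libs")  # hypothetical directory
```

Once you can predict what sys.path contains, most "ImportError: no module named..." mysteries stop being mysteries.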
Once you have those basics, you can go and try whatever you like: poetry, pyenv, pew, virtualenvwrapper, pipenv and so on. It will be easy because you will have a solid understanding of where each comes from and how it plugs into the ecosystem. You will be able to choose whether they are worth it or not for you. And more importantly, when you have to move away from them (because of work, because they don't suit you, because they are abandoned...), you will always be able to fall back to basics.
But frankly, there are so many things that you should be learning before those extras: pdb, black, pylint, zipapps... Or even concepts like generators, decorators, unpacking, etc. They all have better value for the time you spend on them than learning a new "modern" stack.
Not that it's not useful to have a modern stack. I update my stack all the time.
I'm currently using pip + pew + doit + nox + pytest, and I'm experimenting with dephell.
But it's my full time job. I'm an expert.
Most Python devs have a much more limited amount of resources to spend on learning stuff. They have deadlines, and other things to care about than Python.
I know because I go from companies to administrations to students to train them, and it's always the same: they chose Python because it's an efficient use of their time.
Context is important. Developing applications vs libraries vs scripts is different, developing internal company software vs external software is different.
If you're shipping a package on pypi, you will want to develop and automatically test using multiple Python versions and maybe a range of versions of your dependencies. Having good tools to manage multiple sets of Python versions and dependencies and parallel automated testing is a godsend.
If you're writing an internal or web application that will run only in a single well-known environment, pip + venv + pytest might well be good enough.
Sure, but again, there is nothing modern about this. And certainly not, hyper modern.
Besides, if you want to develop and automatically test using multiple Python versions and a range of versions of your dependencies, none of the tools from the article will do it.
At best they are one of the ways to get dependencies before you use a tool to do it. You can do that with raw pip and venv as well.
What you need though, is something like tox, which has existed for years.
As mentioned in my comment, I personally use nox for this, as it is, ironically, more modern to my taste.
Nox doesn't care if you use poetry, pew, virtualenv wrapper or venv manually.
I would certainly advise learning something like nox or doit before learning poetry, for example.
Now I don't advise against learning poetry, mind you. It's a good tool, well written.
Pyenv is another matter entirely. The day someone really needs what pyenv gives them, to the point it's worth investing in it, they can discard all my advice entirely, as they will have the skill and experience to not need my explanations.
Not sure if you've read just the first article, it's a whole series, they cover nox later on.
I don't see how this stack is "hypermodern" either, mind you.
My point was just that these articles and other like it tend to omit context completely. What is this stack good for? Who should use it? Who shouldn't? Without answering these questions the articles are not all that useful, or even could be harmful, if a dev who doesn't know better apes a complicated stack for no good reason.
My team has struggled a lot with how to do python deployment.
We know we can isolate the interpreter and modules via venv/docker/etc. and get repeatable and reliable deployments. However, because we're a utility module, we like to allow users to import our code freely, and that's a lot harder when that code is isolated and bundled with very specific requirements.
Seems like the only perfect solution is to just support every conceivable version of python and our required modules. Which is of course very hard. It would probably require greatly reducing our usage of tensorflow and some other packages which change a lot and are trickier to use on Windows.
What's the benefit of using pyenv? I've been using `python -m venv .venv` to create virtualenv within the project directory for a while now, and VS Code recognize the .venv directory as python virtualenv and apply it to the workspace automatically. So far it's been quite painless for me.
Pyenv is for installing and managing Python versions on a machine. I don't think it makes sense to compare it to virtualenv, as virtualenv doesn't install Python; it only copies it from the system into the current directory. You could use both tools.
If you want to use different versions of Python itself, without installing them as different named binaries or using shell aliasing, etc. Helpful to install Python 2.7, Python 3.x, 3.y, etc. and invoke each as simply `python ...`.
They serve a different purpose. Pyenv enables you to maintain several Python versions. You can then use the venv module of each respective Python to create a virtual environment with a Python at a given version.
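As a sketch, the venv step is just the stdlib module; whichever interpreter runs it (e.g. one installed by pyenv) is the one the environment gets bound to. The temp directory here is only for illustration:

```python
# Create a virtual environment programmatically; this is what
# `python -m venv <dir>` does under the hood. The environment is tied
# to whichever Python interpreter executes this code.
import tempfile
import venv

env_dir = tempfile.mkdtemp(suffix="-venv")  # throwaway location for the demo
venv.create(env_dir, with_pip=False)        # with_pip=True also bootstraps pip
print("created venv at", env_dir)
```

So `pyenv local 3.7.7 && python -m venv .venv` gives you a 3.7.7 environment regardless of what the system Python is.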
I see, I think I confused it with pyenv-virtualenv. I've never needed to use multiple versions of Python, so I've never tried it. I always install the latest version globally and use a Python 3 Docker image as the base deployment image.
If you have multiple projects, it may be undesirable that you're forced to have them all on the same Python version. Of course it helps that Python has fairly good backward compatibility, and even using a newer version to develop and test than the one you'll deploy with is not the end of the world, but pyenv is easy enough that I see no reason not to do it properly.
I guess that there are many tools for environments now... my personal choice is using conda (I actually start with miniconda and download what I need from conda-forge).
It provides the needed isolation, you can get lots of packages without having to compile from conda-forge and it still works with pip for the odd cases where a given package is not directly available (as a bonus, it works for other native toolchains, not only Python, which is a huge plus for me).
The only thing I never understood about conda is why it does not support packages from PyPI.
I mean, the Python community has a standard repository for packages (PyPI). You cannot even name a library $something if $something is not available on PyPI. Why would I use something hosted by a private company that does not even interface with the rest of the community?
This is not an attack on conda. I just cannot understand its rationale.
The real problem (which is solved by conda) is that pypi doesn't solve dealing with non-python packages well (say, compile scipy and make all related native packages communicate well). This is a huge issue for people dealing with scientific packages (and the main pain point which pypi being just python-focused on installing on site-packages doesn't solve well enough IMHO).
Also, while it was initially done by a private company, I'd say it's definitely a community effort right now (it's also the reason I tend to use https://conda-forge.org/, which is community driven and not anaconda).
As for dealing with pypi, it does integrate well enough for me (given that you can just pip install packages on the python for which conda is managing the env), but yes, the other way around isn't true (conda solves a bigger problem than pypi up to the point that it's possible to even have non-python tools available -- one real use case example I have here is having innosetup binaries as a tool in the PATH in some conda env for doing builds).
Note: I don't have any affiliation with any of that, these are just my preferences for managing python envs (when I'm developing pydevd, which is the debugger engine used in pydev/pycharm/vscode, many times I need to reproduce some weird env and before using conda that was pretty annoying).
it’s not an abstraction over virtualenv at all; pyenv installs python interpreters, and those interpreters should use -m venv instead of virtualenv anyway.
did you mean pipenv instead? we suck at naming things (and pipenv has its share of issues, no arguments there).
All of these virtual environment managers over complicate dependencies with Python. Using a simple virtual environment is quite easy. All you need is Python, its built-in venv module, setuptools and pip. I've found pip-tools helps with generating/merging comprehensive requirements.txt files.
> This article series is a guide to modern Python tooling with a focus on simplicity and minimalism
I'm not sure using pyenv + poetry + click qualifies as a minimalist setup.
I have never had to use pyenv as I figured out it was more robust to install the python versions directly from python.org and create the appropriate symlinks. Then using the venv module which has been included in python since 3.3.
The argparse module of Python is not that bad once you are used to it.
I'm not saying the tools and libs mentioned in the article shouldn't be used, it's just that they're not mandatory for whoever wants to stay close to the bare minimum.
I would use pyenv locally so I can easily switch versions but dev and prod is a Docker container with one python version, then poetry install the virtual environment.
You don’t even really need to install into a venv, since you’re already in an isolated container, but going straight for a venv is kind of a reflex for most Python devs
Once you get in the habit, using pyenv/pipenv (in my case) is incredibly quick and easy. I use it for almost everything Python-related I touch, except for quick scripts or the REPL.
This is exactly what I needed as a somewhat on/off longtime user of Python.
It basically boils down to using pyenv, poetry and setting up the pyproject.toml and project src layout.
I read the comments before the article and was expecting a MUCH more complicated content and intricate bash, tool or config setups but it seems straightforward.
I feel everything else is part of getting started in professional software development (git, a modern OS, env var setup, bash or shell scripts, etc.), which is in its own right complicated for newcomers.
I think deployment and packaging vary a lot depending on the target deploy env which is why I'm fine with it being left out of the article.
One thing that I haven't been able to figure out with the 20 minutes or so of reading about poetry that I just did -- does poetry support editable installs akin to pip install -e .?
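The closest thing I could find in those 20 minutes is a path dependency with a `develop` flag in pyproject.toml, which appears to behave like an editable install; this is my reading, possibly wrong, and the package name and path below are made up:

```toml
[tool.poetry.dependencies]
my-local-lib = { path = "../my-local-lib", develop = true }
```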
Reading this article (which asks me to install 18 apt packages on my system) and the top comments here (complaints about developing, running and deploying apps, replicating deterministic Python environments and so on) makes me wonder why Python developers don't use Docker.
I think you're confused: pyenv is not a replacement for venv. It's just a tool for installing Python itself. It never occurred to me until just now that the name is confusing and makes it seem like it's related to venv.
I think that given pip and venv are bundled with Python, then anyone who wants to introduce new dependencies to replace them needs to justify themselves. I haven't used either pyenv or poetry so I have no opinion on them.
The benefit of pyenv seems clear to me - it's a simpler way to have multiple Python installs and manage which one is your "default" Python at a project, directory, user, system, and global level (they can even all be different).
Poetry...not so much. requirements.txt is good enough for me most of the time.
Poetry gives you exactly what pip and venv give you, with the two pretty much perfectly integrated, in a way that is pleasant to use and not something that you'll grudgingly migrate your project to after your fifth dependency.
Well, poetry gives you a lock file, separation between dev dependencies and runtime dependencies, and easy deployments to PyPI.
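For reference, the pyproject.toml that drives all of that looks roughly like the following; the names and version constraints are placeholders, and the spelling of the dev-dependency table has varied across poetry versions:

```toml
[tool.poetry]
name = "myproject"
version = "0.1.0"
description = ""
authors = ["Someone <someone@example.com>"]

[tool.poetry.dependencies]
python = "^3.10"
requests = "^2.28"

# Tools that ship to developers but not to production.
[tool.poetry.group.dev.dependencies]
pytest = "^7.0"
```

`poetry lock` then pins the full transitive tree, and `poetry install --without dev` (on recent versions) gives you the runtime-only set.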
If you're hacking on a one-off script you might not need these features, but for "real" projects Poetry is invaluable.
I've been running stacks with venv and pip for years and it's been perfectly adequate. The main add-on I use is virtualenvwrapper, which basically just gives you sugar like mkvirtualenv and workon.
> Yeah, but pip and venv don’t leave me impressed, so please elaborate why those are better.
Maybe you misspoke, but whether you are left impressed or not is your opinion. What is factual is that these tools are the ecosystem's default tools for their jobs, and have been for a long time. I have relied on them while operating multiple production codebases over the course of a decade.
I think you need to elaborate on what about that doesn't suit your needs rather than that they don't impress you for some vague, unspecified and potentially arbitrary reason.
To answer your question: they impress me because they get out of my way, they work, and they don't require me to do anything but have a system version of Python installed. With them, I get reasonably sane package management and environment isolation without headaches. That's enough for me.
I don't think this is for a newcomer. Newcomer, here you go:
print("Hello World!")
This is for someone who has done Python for many years, but might have fallen behind on the latest and greatest trends. It collects a series of promising, bleeding-edge tools and gives an overview of what makes them clever and how to use them.
This pops up occasionally on HN but I don't get it. Are you reading every line of every installation script you run normally? What about the things they download and execute? Are you never running a binary installer where you don't have access to the code?
This post is a snapshot of one webdev's toolbox. It certainly is indicative of webdev's predilection for increasingly complex and opaque tooling, mixed-markup formats, and over-engineering (if it can even be called engineering).
You don't really need any of that stuff. That whole tooling ecosystem undergoes a rewrite every couple years, anyways, so I predict this post will not age well.
Just stick to setup.py, pip, and virtualenv. It's sufficient unto all thy needs.
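For anyone who hasn't seen the classic trio, a project needs little more than a setup.py like this (all names and versions below are placeholders), plus `pip install -e .` inside a virtualenv:

```python
# Classic setuptools configuration; installable, editable, uploadable.
from setuptools import setup, find_packages

setup(
    name="myproject",
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "requests>=2.0",  # runtime dependencies, loosely pinned here
    ],
)
```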
I was with you up until your last sentence. pip and virtualenv are hacky kludges that grew legs and lungs; that's how old Python's deployment issues are: there was an era before pip and virtualenv.
There should be an abstraction for it (deployment, versions, import hacking) like what pathlib did for file path manipulation.
> What makes you think the newer tools are better than pip and virtualenv?
I don't. Heck, maybe they are. I don't use them.
My point is that rather than being the beneficiary of Python's excellent abstraction powers, distribution et al. is and has been a mess for decades.
Here's a history of the sordid mess starting from 1998: "The distutils-sig discussion list was created to discuss the development of distutils." https://www.pypa.io/en/latest/history/
FWIW, since pip and virtualenv became relatively stable and "blessed" and PyPI has matured, the only new thing I've tried is Anaconda.
Unfortunately my main Python project right now uses Tkinter and the folks who make Anaconda have a circular dependency in their build system such that you need python to build Freetype so their Python/Tkinter/TCL/Tk has gruesomely-bad support for fonts, so my project looks like a potato. https://github.com/ContinuumIO/anaconda-issues/issues/6833 Someone has put in a PR to hopefully fix it, and-- ah! --it seems to have gotten some attention earlier today: https://github.com/conda-forge/tk-feedstock/pull/40 Fingers crossed for great good!
I sort of agree, but also think you're handwaving away some actual pain points. pyenv makes it really easy to manage multiple Python installations on the same system. Also, while I prefer pip-tools over poetry for its simplicity and the fact that it has stood the test of time, both accomplish the goal of pinning all direct and transitive pip dependencies.