Questioning Microsoft publicly is dangerous

Written by lilithriver | Published 2017/01/01
Tech Story Tags: development | microsoft-open-source | microsoft | github | package-management


Note: This article was originally posted as github.com/Microsoft/vcpkg issue #478 regarding docs/FAQ.md, and its rather careless misrepresentation of the community alternative to their project.

My account was taken offline within 20 minutes, along with all my GitHub repositories and comments (and the post in question). I’ve had an active account (and credit card on file) for nearly 8 years, but never had this happen before. If I go to github.com/contact, I am presented with the rather ironic text:

Need help with the Microsoft/vcpkg project? You could get in contact with @Microsoft. For example, you could open an issue on the repository or send the maintainer an email.

Update: My account is back online for now, thanks to @nothingstrivial. I’m very grateful for his help, particularly given holiday celebrations! Looking forward to detail on how to avoid this in the future.

For the moment, it might be best to avoid interacting with Microsoft repositories, particularly if you rely on GitHub for CI dependencies or website hosting. I heard (unofficially) that high-value targets for trolls receive unique handling. I’m not aware of any way for users to evaluate this risk.

“Why not Conan?” FAQ section confuses and misrepresents

All 4 portions of the “Why not Conan?” section of the FAQ either lack logical coherence or are completely orthogonal to technical merit and tool suitability.

The “Public federation vs private federation” section argues against permitting individuals to publish packages, stating

“we believe there should be a single, collaboratively maintained version which works for the vast majority of cases and allow users to hack freely on their private versions.”

You don’t need extreme (and unfortunate) technical limitations in order to create a walled garden. Just run your own Conan server, and only authorize uploads from the Microsoft account. You can keep the ports directory; just replace ‘CONTROL’ with conanfile.py (there’s an automated tool for it). Only permit CI to upload with that account, and voilà! You’ve perfectly mirrored the existing access permissions and model for vcpkg, and done so without crippling user freedom.

You can argue for curation, sure, but baking that into design for build dependencies doesn’t end well. I love Homebrew, but we’re lucky to get non-core packages like libvips updated within 2 months. Reliance on curated listings doesn’t scale. If you do curation, strictly limit it to what you can commit to managing and responding to 0days for. Encourage and support the community in doing the rest.

If you want to improve discoverability of ‘official’ packages in Conan, submit a PR for the package sorting algorithm. A convention of prioritizing packages where the username and package name match would be a nice, zero-maintenance way to allow authoritative versions through username reservation. Or just allow only 1 instance per package name. We’re talking about trivially tweaking the package server. Your first reason lacks substance.
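To make the sorting suggestion concrete, here is a minimal sketch of such a ranking rule. It is not Conan's actual server code: the function names are hypothetical, and the only thing borrowed from Conan is the name/version@user/channel reference format that appears later in this article. The zlib references are made-up sample data.

```python
from typing import NamedTuple


class PackageRef(NamedTuple):
    name: str
    version: str
    user: str
    channel: str


def parse_ref(ref):
    """Split 'name/version@user/channel' into its four parts."""
    name_version, user_channel = ref.split("@")
    name, version = name_version.split("/")
    user, channel = user_channel.split("/")
    return PackageRef(name, version, user, channel)


def sort_search_results(refs):
    """Hypothetical ranking: packages whose user matches the package name come first."""
    def key(ref):
        p = parse_ref(ref)
        is_official = p.user.lower() == p.name.lower()
        # False sorts before True, so negate to put matches first.
        return (not is_official, p.name.lower(), p.user.lower())
    return sorted(refs, key=key)


# Made-up search results; only the Poco reference comes from the article.
results = sort_search_results([
    "zlib/1.2.8@alice/stable",
    "zlib/1.2.8@zlib/stable",
    "Poco/1.7.3@lasote/stable",
])
```

With this rule, reserving the username "zlib" is all it would take for a project to have its own upload listed ahead of third-party ones.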

Per-dll vs Per-application. When dependencies are independently versioned on a library level, it encourages every build environment to be a completely, unique, unable to take advantage of, or contribute to a solid, well tested ecosystem [emphasis mine]. In contrast, by versioning all libraries together as a platform (similar to a system package manager), we hope to congregate testing and effort on very common sets of library versions to maximize the quality and stability of the ecosystem. This also completely designs out the ability for a library to ask for versions that conflict with the application’s choices (I want openssl Z and boost X but X only works with openssl Y).

This contradicts decades of evidence. All other development package managers (that I know of) allow libraries to specify dependencies. This isn’t something the entire world got wrong. ABI fragility isn’t unique to C/C++, just accentuated. We’ll talk about system package managers (and their role in C/C++ software development) later.

Conan supports per-application versioning, and dependency version overrides. Your application just uses another conanfile.py to specify dependency versions. That said, I’ve been just fine (and override free) with Conan’s ecosystem.
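As a rough illustration of what an application-level override means, the sketch below lets libraries declare default dependency versions and lets the application settle any disagreement. This is an assumption-laden toy, not Conan's resolution algorithm: the function, the data layout, and the library names are hypothetical, and "Y" and "Z" are the placeholder versions from the FAQ's own example.

```python
def resolve(library_defaults, app_overrides):
    """Pick one version per package; the application's overrides always win.

    Returns (resolved, conflicts), where conflicts lists packages for which
    libraries disagree and the application has not stated a choice.
    """
    wanted = {}
    for deps in library_defaults.values():
        for package, version in deps.items():
            wanted.setdefault(package, set()).add(version)

    resolved, conflicts = {}, {}
    for package, versions in wanted.items():
        if package in app_overrides:
            resolved[package] = app_overrides[package]
        elif len(versions) == 1:
            resolved[package] = next(iter(versions))
        else:
            conflicts[package] = sorted(versions)
    return resolved, conflicts


# Hypothetical libraries that disagree about openssl.
LIBRARY_DEFAULTS = {
    "boost": {"openssl": "Y"},
    "libcurl": {"openssl": "Z", "zlib": "1.2.8"},
}

unresolved = resolve(LIBRARY_DEFAULTS, {})
settled = resolve(LIBRARY_DEFAULTS, {"openssl": "Y"})
```

The point is that a conflict is visible and fixable in one place (the application), without removing the libraries' ability to state their defaults.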

The maintainers of a library are usually in the best position to know which versions of its dependencies should be the default. You use the word ecosystem. I do not think that word means what you think it means. For in the previous paragraph, you said:

we believe there should be a single, collaboratively maintained version

That is …not an ecosystem. But we will discuss ecosystems later. You continue

we hope to congregate testing and effort on very common sets of library versions to maximize the quality and stability of the ecosystem.

So.. “versions”. Okay. I like the concept of opt-in platform versions. I.e, “here’s a set of library versions, all of which are compatible”. And here’s a new set of versions. etc.

That’s great. That could also come in the form of:

  • A git repository with a single plain text file of conan version numbers.
  • A trivial pull request to add said feature to Cargo.
  • Something not claiming to replace a package manager. I like the name “Port Tree”. It doesn’t imply anything more than it is — a collection of build scripts for various curated packages.
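The first option can be sketched in a few lines. The file format, the set name, and the OpenSSL and zlib version numbers below are hypothetical; the idea is only that a "platform" can be a plain text list of name/version pairs that projects opt into.

```python
# Hypothetical contents of a version-set file kept in a git repository.
PLATFORM_SET = """\
# platform-2017.01 (made-up example set)
Poco/1.7.3
OpenSSL/1.0.2j
zlib/1.2.8
"""


def parse_version_set(text):
    """Read 'name/version' lines, skipping blanks and '#' comments."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, version = line.split("/", 1)
        versions[name] = version
    return versions


def pin_to_platform(requested_names, version_set):
    """Turn bare package names into name/version pins from the chosen set."""
    missing = [n for n in requested_names if n not in version_set]
    if missing:
        raise KeyError("not in this platform set: " + ", ".join(missing))
    return ["%s/%s" % (n, version_set[n]) for n in requested_names]


pins = pin_to_platform(["Poco", "zlib"], parse_version_set(PLATFORM_SET))
```

A project that wants a different set just points at a different file or git tag, so nothing here needs to be system-wide.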

Doesn’t vcpkg only support 1 (user-wide) set, though?

An enforced system-wide version set is exactly the pain we’ve been trying to escape from for decades on linux/mac. We’ve invested tremendous effort in escaping this situation. It completely prevents cross-platform development, because no two operating systems will contain the same version set.

Don’t you think designing out the ability for libraries to specify dependency versions is a bit overkill? That’s one way to ensure an ecosystem can never develop, because packages can only have good usability if they live in an official platform set.

Your choice of OpenSSL as a conflict example is a good one. Across my projects, breaking OpenSSL API changes have been painful. But none of that pain was present for Conan-managed libraries. I can clearly see what versions libraries support, and if I need to update something, it only takes a few minutes. How would a Windows-system-wide copy of OpenSSL be any less painful with vcpkg than it was with homebrew or apt-get? Please describe; this would be an excellent addition to the FAQ. I would describe the balance and boundary between curated and non-curated software, and how the size of that surface affects the suitability of this tool.

Reason 2 is hyperbolic doublespeak.

Cross-platform vs single-platform. While being hosted on many platforms is an excellent north star, we believe the level of system integration and stability provided by apt-get, yum, and homebrew is well worth needing to exchange apt-get install libboost-all-dev with brew install boost in automated scripts.

I don’t use homebrew or apt-get or Chocolatey to manage my dependencies. I could, in theory, develop within Docker containers, and use apt-get or homebrew inside them. But I’m not forced to, because Conan exists. I have full package isolation, and can support as many different sets of versions as I want. Today, I’m simultaneously running Conan on OS X 10.11, Windows 10, and Ubuntu 14.04. I have several different projects with different needs, and everything JUST WORKS.

This is nice. It’s what I’m used to experiencing with RVM+Bundler, Virtualenv+pip, Rustup+Cargo, npm, and even NuGet (modulo a few bugs).

Let’s clarify some terms:

  • System package managers: apt-get, homebrew, Chocolatey. Only one version of a package can exist on the system. One can have gcc and gcc5.3, but they are separately named packages. You can’t run gcc5.3 by invoking gcc. The other limitation of apt-get, homebrew, and Chocolatey is that they only install with root access.
  • Dependency package managers: Bundler, npm, Cargo, pip, NuGet. Usually associated with a certain language or set of languages, where the package manager’s design often takes specific needs of the language’s compiler/runtime into account. These usually rely on the system compiler/runtime for their language, and are therefore combined with environment managers so that developers can work on more than 1 project at a time (assuming different projects need different versions). There are usually multiple instances installed if you’re using a version manager. Many offer primitive runtime/compiler switching, or are intentionally environment/version manager friendly.
  • Environment managers: rvm, rbenv, nvm, virtualenv, dnx, rustup. These allow you to use different compilers/runtimes (and different package manager versions) for a given language at the same time. They may shuffle storage data around so that package managers (if using user-local storage) aren’t reusing the same caches against multiple runtime versions.

In developing C/C++, we’re constrained by our system package manager’s curation as to what compilers we can have simultaneously installed. We do have the CXX and CC env vars, however, and Conan helpfully provides compiler switching. It doesn’t redefine any behavior outside of its process duration, however, so in this respect it is more like rbenv than rvm.
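The "process duration" point is easy to demonstrate with nothing but the standard library. This is a sketch of the general technique (setting CC and CXX for a child process only), not a description of how Conan implements compiler switching; the helper name and the compiler names are placeholders, and a child Python process stands in for a real build command.

```python
import os
import subprocess
import sys


def run_with_compiler(command, cc, cxx):
    """Run command with CC/CXX set for the child process only."""
    child_env = dict(os.environ, CC=cc, CXX=cxx)
    return subprocess.run(command, env=child_env, check=True,
                          capture_output=True, text=True)


# Stand-in for a build command: a child Python that reports the CC it sees.
probe = [sys.executable, "-c", "import os; print(os.environ['CC'])"]

before = os.environ.get("CC")
result = run_with_compiler(probe, cc="clang", cxx="clang++")
seen_by_child = result.stdout.strip()
after = os.environ.get("CC")  # unchanged: the parent environment was never touched
```

Because the override lives in the child's environment, two builds with different compilers can run side by side without stepping on each other or on the user's shell.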

System-wide build dependencies preclude reasonable cross-platform development

System-wide package management works smoothly when you’re not adding any custom code to the mix, and you’re not at all picky about what version of a tool you end up with. This constraint is often unacceptable, so modern tools often ship in Docker containers. A new generation of languages appeared just to solve this problem through pervasive static linking (which has its own servicing concerns).

Developing software against system-wide dependencies is a nightmare. We have no recourse against breaking changes in security updates, no method for compiling two (completely separate!) tools that need different libpng ABIs. This is unnecessary pain.

This is a very strange point in time (given Microsoft’s new strategy) to begin (deceptively, or through ignorance) promoting approaches that tie developers to Windows and ensure future pain.

We chose to make our system as easy as possible to integrate into a world with these very successful system managers — one more line for vcpkg install boost — instead of attempting to replace them where they are already so successful and well-loved.

There are successful system managers in this world, yes. I’d also say OneGet might deserve the designation. Your design does not make vcpkg easier to integrate. This paragraph tries to frame vcpkg as filling the role of a systems package manager, and is very cleverly worded.

But the goal of improving developer experience across operating systems is not well served by vcpkg. Linux package managers came first, and distributions offered floating, but usually compatible, versions of libraries and tools in curated repositories. One could switch ‘stable/beta’ channels by changing repositories. But compatibility between distributions was still poor, and we often, because of time constraints, had to pick one distribution at a time to support. Backports might happen. So when we started bringing those apps to Mac, we needed to mimic that set of dependencies. Today, Homebrew recipes often use precompiled binaries, and often fetch specific dependency versions. Homebrew also makes it possible to install specific recipe versions. We didn’t, and can’t, improve developer experience by increasing the number of permutations they must troubleshoot.

It should be telling how unified and focused the entire industry has been on eliminating system package management. Dependency hell is painfully expensive in development, but catastrophic in production. Thus we got Docker, and Ubuntu Snap Packages, and a hundred iterations of containerized packages, and a dozen new cloud operating systems all with the goal of NOT sharing dependencies with anyone else, and sometimes even eliminating system package management altogether.

Reason 3 ignores modern development and devops practices, and uses some of the trickiest wording I’ve seen outside diplomatic cables.

C++/CMake vs python. While Python is an excellent language loved by many, we believe that transparency and familiarity are the most important factors when choosing a tool as important to your workflow as a package manager. Consequently, we chose to make the implementation languages be as universally accepted as possible: C++ should be used in a C++ package manager for C++ programmers. You should not be required to learn another language just to understand your package manager.

Arguments: 1. Transparency, familiarity of interaction with tool. 2. Implementation language as universally accepted as possible. 3. You shouldn’t need to learn another language to understand your package manager.

I’m going to assume that if you’ve successfully edited a .ini file, you can use the declarative conanfile.txt syntax. If you’re keeping logic in CMake, it’s likely sufficient. Conan passes CMake all the variables you need. You’d probably switch to conanfile.py if you need different dependencies depending on the operating system (say you use SChannel on Windows, but OpenSSL on linux).
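To back up the .ini comparison, here is a small sketch that reads a conanfile.txt with Python's standard configparser. The file contents mirror the requirements and options of the conanfile.py example in this article; the helper function is hypothetical, and Conan has its own parser, so treat this purely as evidence of how INI-like the format is.

```python
import configparser

# Declarative counterpart of the conanfile.py example, for illustration.
CONANFILE_TXT = """\
[requires]
Poco/1.7.3@lasote/stable

[generators]
cmake

[options]
Poco:shared=True
OpenSSL:shared=True
"""


def read_conanfile_txt(text):
    """Parse the INI-like sections into plain Python lists and dicts."""
    # allow_no_value: [requires] entries are bare lines without "=".
    # delimiters=("=",): keep the ":" in "Poco:shared" as part of the key.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.optionxform = str  # preserve case ("Poco", not "poco")
    parser.read_string(text)
    return {
        "requires": list(parser["requires"]),
        "generators": list(parser["generators"]),
        "options": dict(parser["options"]),
    }


parsed = read_conanfile_txt(CONANFILE_TXT)
```

Everything in the file is a section header, a bare value, or a key=value pair, which is the whole syntax a user needs for the common case.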

For the sake of argument, let’s start with one of the most complex examples listed for conanfile.py:

    from conans import ConanFile, CMake

    class PocoTimerConan(ConanFile):
        settings = "os", "compiler", "build_type", "arch"
        requires = "Poco/1.7.3@lasote/stable"
        generators = "cmake", "gcc", "txt"
        default_options = "Poco:shared=True", "OpenSSL:shared=True"

        def imports(self):
            self.copy("*.dll", dst="bin", src="bin")  # From bin to bin
            self.copy("*.dylib*", dst="bin", src="lib")  # From lib to bin

        def build(self):
            cmake = CMake(self.settings)
            self.run('cmake "%s" %s' % (self.conanfile_directory, cmake.command_line))
            self.run('cmake --build . %s' % cmake.build_config)

Is anyone capable of writing C or C++ going to have trouble understanding this?

The most magical bit is that CMake(self.settings) exposes .command_line and .build_config attributes, and that this generator-specific logic makes more variables available to CMake than you would expect.

I can’t compare this to vcpkg, because I don’t see where it even offers a dependency list. I assume that some kind of script is used to invoke it for each package? Looks like you’re learning another language!

Argument 2. Implementation language as universally accepted as possible.

Accepted by who? The operating system? Python is the most ubiquitously installed scripting language on earth, as far as I know. It’s usually present on even the most minimal docker images. You distribute it with Visual Studio. It’s bundled with pretty much every distribution of linux and os x.

If you’re talking about comprehension by the C++ development community, that’s an unusual argument to make. Package managers aren’t usually inspected; we just expect them to work. If they don’t, we complain to specialists who understand the domain. It’s a large domain, and there are many pivots, and complexities.

Let’s look at some examples. NuGet has been around for 5+ years, but only acquired 30 significant contributors. I didn’t spot any that weren’t MSFT employees. Perhaps the project isn’t managed as a true open-source project, and that had an effect. Let’s look at a community-developed project — Chocolatey. Wow. Just one significant contributor.

Domains like package management, codecs, image processing libraries — they require a lot of domain knowledge to contribute to. As maintainers, we want to choose the technologies that we think are most likely to encourage contribution. We’re better off choosing the best tool for the task. If there’s already a tool that does what we need, we use it.

Rejecting a tool because it isn’t written in your favorite language seems like a Microsoft culture phenomenon. In other communities, we aggressively blend tools written in a range of languages. Take a rails app. There are usually 1–2 tools written in an ML-style language, several C and C++ components, embedded lua runtimes, some form of npm for asset management (at least), Perl (using git?), a bit of Crystal, Rust, various Java packages, and a variety of Python apps that you probably don’t know are built with Python, because they haven’t spewed a stacktrace yet.

We’re hoping to see NuGet reach 60% feature parity with Bundler by 2019. I’m told that, before NuPack, in the early stages, Bundler was being used. But it needed to be C#, not Ruby, of all things. So we resorted to … a single-platform, single-tool set of scripts claiming to be a package manager. There’s history here, and it seems to be repeating.

Argument 3. You shouldn’t need to learn another language to understand your package manager.

My response: documentation.

vcpkg should not brand itself as a package manager

There’s already an open-source package manager for C and C++. It’s well designed, the result of many real-world iterations and deployments, and backed by developers with an incredible amount of experience in the space.

  • It handles pivots extremely well across operating systems, C runtimes, and compiler compatibilities. The design is excellent.
  • It is build-system agnostic (able to integrate with everything), yet is incredibly simple to understand and use.
  • It can work directly with Visual Studio, through CMake, or nmake, or any tool you like, yet has minimal boilerplate.
  • You have conanfile.txt or conanfile.py to declare your package, dependencies, extra configuration/pivots, and build invocation through which the pivots are sent. Providers for the top 12 build systems make this automatic — unless you want more control. At no level of abstraction is there pain. conaninfo.txt, the file Conan creates in a target build directory, provides visibility into the compiler settings and package versions in use.
  • Essentially cruft-free, yet handles extremely complex situations.

It is trivial to use with Visual Studio and CMake.

Ecosystem

I said I’d get back to ecosystems. vcpkg’s FAQ describes itself as the champion of a package ecosystem, and specifically superior in this respect to Conan. Let’s zoom back a bit and look at the ecosystem of package managers.

Adoption of package managers

C and C++ are hard languages for which to do package management. Developers are justifiably gun-shy — many build tools have claimed to tackle package management, but failed horribly on key details. Some build-tool-specific managers exist (cpm, hunter).

It’s hard to encourage devs to invest in new tooling, given obvious limitations (or suspicion of hidden ones). Before Conan, the core developers created Biicode. As a company, Biicode was unable to monetize C/C++ package management, and eventually went bankrupt. They took the lessons learned from commercial C++ package management and designed Conan, while helping Biicode users transition as best they could, and funding their work with weekend consulting. The community grew, and they’re now a part of JFrog/Artifactory, but I’ve not seen that much dedication to a problem before.

The C/C++ ecosystem consensus seems to be converging towards CMake as a cross-platform build tool. This makes my life easier. But adoption of package management is a slower task, despite requiring a much smaller effort. The pain of system package managers has spawned a “copy the source in-tree” religion that is very hard to eradicate.

Maintainers of common packages are just beginning to add conanfile.py, and set up Travis and AppVeyor for cross-platform package deployment. We badly need this trend to continue if we want low-friction cross-platform development to become ubiquitous.

Microsoft has a recorded history of displacing existing or nascent open-source projects. Often the replacement was designed without a study of prior art, and lacking in technical merit.

The most effective approach to ensuring a solution succeeds based on advertising or brand, rather than merit, is to either

(a) pretend you’ve never heard of them, and be silent in places where users trust you to credit prior art, or (b) use a disinformation strategy to ensure the competition’s role and merit is misunderstood.

Given that a massive contingent of the software development community looks to the Microsoft brand for leadership in tool and technique selection (evangelism does its job), I would argue that a majority of Windows C# and C++ developers entering the open-source space are doing so (at least in part) because Microsoft has embraced open-source. They still favor tools Microsoft signals interest in.

Herd mentality overrides almost all else, and software developers participate in this intentionally, perhaps hoping that usage of a thing will delay its inevitable, eventual abandonment. Microsoft may no longer be the bellwether of the software industry, but it is still the bellwether for an extremely large contingent, and that is quite enough to start a herd.

There’s some irony in the fact that download and GitHub issue counts may have no bearing on software longevity (the Contributors tab is what matters), but in adoption, perception is 90%.

I like to think that there is a hangover from closed-source culture that takes a long time to shed, and that this hangover (rather than any particular intention) is responsible for the more careless events that have so strained Microsoft’s relationship with the open-source community in the last decade.

Microsoft employees who have been active for a very long time in the larger open-source community seem more careful to consider the impact of their publications, and more effective at reaching their goals through collaboration and coordination. More wood behind fewer arrows.

Many people argue that 1 package manager per language is too many, and unsustainable given the domain difficulty and contributor pool. They’re right. It doesn’t matter, though, because software developers need package managers for rapid development, and they need language-specific features. We can unify (and have been unifying) the science of package management, and this helps, but there are simply not enough people for the workload. So we take what we can get. We’ll even sacrifice build reproducibility for years to get that productivity (see NuGet).

I do not see any evidence that there was a legitimate analysis of prior art before this project’s promotion. Many other rationales in the documentation are troubling.

The misrepresentation of Conan (and the wider context) in the FAQ is very troubling. We have worked hard to make open source a kinder and more sincere place.

There’s a nice tradition of projects mentioning close prior art at the top of their README, and providing a differentiation (or link to one) above the fold. This differentiation should stick to objective and carefully researched facts, along with each project’s official declaration of scope and goals. It is very considerate to provide the reader with all of the facts, and let them consider what solution has the best tradeoffs for their situation.

This sets the stage for a spirit of friendly collaboration and sincere work towards our common goal of making software developers happier (or something adjacent to it, I hope).

Published by HackerNoon on 2017/01/01