While you do need the things on bzt's list, that doesn't mean you have to learn them on your own in a vacuum. You learn to read technical documentation, in part, by asking someone: a mentor, or the general public, since that is an option now. Do learn how to ask the question. Don't come in saying "I am going to build my house but I can't figure out how to operate the nailgun." Simply say "I can't figure out how to operate this nailgun." If someone asks why, you say "I want to learn how to use it. I might want to build something with it some day, but I can't until I learn how to use it."
Not all operating systems are created solo. Great if you can pull it off, but not all are done that way. Some folks may do the high level design, some do mid or low level design, some do various parts of the implementation. A number of folks start off doing the grunt work for the senior folks, who nap most of the day and bark orders or give advice or tell long stories about way back when (folks like me). As you do more grunt work for your mentors, *IF* you pay attention to what the others are doing and try to understand it, occasionally asking questions, good, bad or otherwise, you work your way up the ladder to being that senior person, who in theory can build the whole thing solo, or can just bark out orders and nap.
From your original question, step 1: get a toolchain, even better get three or four.
Step 2: cross compile an OBJECT, and then disassemble it to see what was produced.
Step 3: link one or more objects. Start to master the linker, or at least beat it into submission enough to control where the sections go and in what order, or at the very least get the vector/start code in the right place in the binary.
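Steps 2 and 3 can be tried with something as small as the file below. This is only a sketch: the arm-none-eabi- prefix assumes a GNU toolchain built for bare-metal arm (yours may be named differently), and the file name notmain.c, the function name more_fun and the linker script name memmap are made up for illustration. Linking a lone object with no start code will make the linker complain about a missing entry symbol, which is part of what you are learning to deal with.

```c
/*
 * notmain.c: something small enough that you can read every
 * instruction in the disassembly.
 *
 * Assumed GNU tools (prefix may differ on your system):
 *
 *   arm-none-eabi-gcc -O2 -c notmain.c -o notmain.o      compile one object
 *   arm-none-eabi-objdump -D notmain.o                   see what was produced
 *   arm-none-eabi-ld -T memmap notmain.o -o notmain.elf  link it
 *   arm-none-eabi-objdump -D notmain.elf                 see where things landed
 *   arm-none-eabi-objcopy notmain.elf -O binary notmain.bin
 *
 * "memmap" is a hypothetical linker script that you write yourself to
 * say where the sections go.
 */
unsigned int more_fun(unsigned int x)
{
    /* Trivial on purpose: expect an add and a return in the listing. */
    return x + 7;
}
```

Compare the disassembly of the object with the disassembly of the linked elf, then change the optimization level or the linker script and look again.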
Then the processor learning starts: how does this processor boot, what do I have to place where to get it to boot, or to continue booting from the bootloader, and have my code take over this processor. Unfortunately the pi 3, for example, has multiple paths you can take, most of them painful even for experienced folks. Getting a pi-zero and starting there is very much in your best interest. Stay on the pi-zero for a very long time. Of the platforms out there the pi is a very good one for your overall goal, but start with the pi-zero, which may involve a little soldering or otherwise figuring out how to get the uart hooked up. If you can find a pi A+ (adafruit has them in stock now), the pins are already there, no soldering required. Then it is a couple to ten dollars for a usb to 3.3v serial adapter (adafruit sells multiple solutions, or try ebay).
This involves reading arm docs and broadcom docs, and learning that documentation is often wrong or has holes. There are schematics for the original pi which to some extent apply to the boards that followed, or at least let you get a feel for one of the designs. The later designs, even the A+, are different, and the pi-zero and pi3 probably more so. Still, it helps to be able to see that on at least one board there was a chip, it had some gpio, and one of the gpio pins is tied to an led that is described in the old baking pi or other tutorials. Then you see that someone has simply given you the answer that on the pi-zero it is not on this gpio pin (or that on the pi3 it is not connected to a gpio pin at all). You can also see in those schematics where the pins we use for a uart are and which gpio pins they are, and in the broadcom docs how each gpio pin has a set of alternate functions, one of which is uart tx or uart rx depending on the pin.
master the led, master the led.
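As a rough sketch of what mastering the led boils down to on the BCM2835 (pi-zero, A+, B+): each GPFSEL register holds a 3 bit function field for ten pins, and the GPSET/GPCLR registers drive an output pin high or low. The function names below are made up for illustration. The base pointer is passed in so the same code can be exercised against a plain array on your desktop; on the board it would be the gpio block, which I am assuming sits at 0x20200000 on these chips. Which pin the led is on (47 on the pi-zero, as far as I know) and whether it is active low are things to confirm for your board.

```c
#include <stdint.h>

/* Word offsets (byte offset / 4) into the BCM2835 gpio block. */
enum { GPFSEL0 = 0, GPSET0 = 7, GPCLR0 = 10 };

/* 3 bit function select codes from the broadcom peripherals document. */
enum { GPIO_INPUT = 0, GPIO_OUTPUT = 1, GPIO_ALT0 = 4, GPIO_ALT5 = 2 };

/* Read-modify-write so the other nine pins in the register are untouched. */
void gpio_set_function(volatile uint32_t *gpio, unsigned pin, unsigned func)
{
    unsigned reg = GPFSEL0 + pin / 10;   /* ten pins per GPFSEL register */
    unsigned shift = (pin % 10) * 3;     /* three bits per pin */
    uint32_t v = gpio[reg];

    v &= ~(7u << shift);
    v |= (uint32_t)(func & 7u) << shift;
    gpio[reg] = v;
}

/* Writing a 1 bit to GPSETn/GPCLRn changes only that pin. */
void gpio_write(volatile uint32_t *gpio, unsigned pin, int high)
{
    unsigned bank = pin / 32;            /* pins 0-31 in bank 0, 32-53 in bank 1 */
    uint32_t bit = 1u << (pin % 32);

    gpio[(high ? GPSET0 : GPCLR0) + bank] = bit;
}
```

On hardware you would call gpio_set_function(base, led_pin, GPIO_OUTPUT) once and then alternate gpio_write(base, led_pin, 1) and gpio_write(base, led_pin, 0) with a delay loop in between. The same gpio_set_function with GPIO_ALT5 on pins 14 and 15 is how the mini uart pins get selected.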
master the uart, master the uart, master the uart.
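For the uart, two pieces trip people up early: the baud rate register and knowing when it is safe to write a character. The sketch below assumes the mini uart in the aux block (assumed at 0x20215000 on the BCM2835) and uses the formula from the broadcom document, baud = clock / (8 * (reg + 1)). The function names are made up, the 250 MHz clock in the usage note is an assumption that depends on your board and firmware settings, and the rest of the init (aux enable, 8 bit mode, gpio alternate functions) is left out.

```c
#include <stdint.h>

/* Word offsets (byte offset / 4) into the aux block. */
enum { AUX_MU_IO_REG = 16, AUX_MU_LSR_REG = 21 };

/* Baud register value, from baud = clock / (8 * (reg + 1)). */
uint32_t mini_uart_baud_reg(uint32_t clock_hz, uint32_t baud)
{
    return clock_hz / (8u * baud) - 1u;
}

/* Poll until the transmitter can take a byte (LSR bit 5), then send it. */
void mini_uart_putc(volatile uint32_t *aux, char c)
{
    while ((aux[AUX_MU_LSR_REG] & 0x20u) == 0) {
        /* wait for room in the transmit fifo */
    }
    aux[AUX_MU_IO_REG] = (uint32_t)(unsigned char)c;
}
```

With an assumed 250 MHz clock, mini_uart_baud_reg(250000000, 115200) works out to 270. If the terminal shows garbage, the clock assumption is the first thing to question.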
beat a toolchain into submission, beat a toolchain into submission. You can see in my examples that I do almost nothing to beat on the toolchain, yet I have near complete control. Some folks like a lot of knobs and features, I like to use as little as possible. YMMV.
If you don't have access to oscilloscopes or other such tools, or don't know how to use them (although you can get a pretty good scope for like $650 now), then the led is your number one debug tool and the uart is number two. You can also use other boards, pis or microcontrollers, to build your own ad hoc logic-analyzer-ish/scope-ish tools. Even having a scope can be as much of a problem as a solution: sometimes touching a probe to the circuit fixes it, sometimes it breaks it...gotta know your tools.
Interrupts, processor modes, etc.: lots and lots of getting your hands dirty...to work through the kinds of things in bzt's list. A number of those learning experiences require mentoring from other folks, online or in person. Even the beginnings of an operating system are a matter of setting up a timer based interrupt and saving state such that you can switch between two tasks without corrupting/crashing one of them, long before you have to design a (more complicated) scheduler, long before you even have to read about protection, etc...Get the books by Tanenbaum or Labrosse or both or others.
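The bookkeeping behind that two-task switch can be modelled on a desktop before you ever touch real interrupts. This is only a model under stated assumptions, not a working kernel: the struct layout and the names context, task_init and timer_tick are invented here, and on the real processor the save and restore are done in assembly in the interrupt handler, with banked registers and processor modes to get right.

```c
#include <stdint.h>

#define NUM_TASKS 2

/* Everything that must survive while a task is not running. */
struct context {
    uint32_t r[16];   /* r0-r12, sp (r13), lr (r14), pc (r15) */
    uint32_t cpsr;    /* status flags and mode bits */
};

static struct context saved[NUM_TASKS];
static unsigned current;   /* index of the task that is running now */

/* Give a task that has never run somewhere to start and a stack. */
void task_init(unsigned idx, uint32_t entry, uint32_t stack_top)
{
    saved[idx].r[15] = entry;
    saved[idx].r[13] = stack_top;
}

/*
 * Model of the timer interrupt: "cpu" holds the interrupted task's
 * registers on entry and the next task's registers on return.
 * Returns the index of the task that runs next.
 */
unsigned timer_tick(struct context *cpu)
{
    saved[current] = *cpu;                  /* save all of the old task */
    current = (current + 1) % NUM_TASKS;    /* simplest possible scheduler */
    *cpu = saved[current];                  /* restore all of the new task */
    return current;
}
```

If any register is left out of the save or the restore, one task eventually sees a value the other one wrote, which is exactly the corrupting/crashing to avoid.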
I hate to say it this way, but if you are struggling to find a cross compiler for arm, you are not ready for the armv7 nor the armv8, which is why I say start with the pi-zero (or the A+) (or the B+) or, whoa, the pi-zero-w at adafruit has headers. You can hold off on having to learn to solder. I do recommend somehow hooking up a momentary pushbutton (closed while pressed) to the run pins, so you don't have to unplug and replug the board to try again. If you can solder, awesome...these https://www.sparkfun.com/products/97
work great on the raspberry pi boards. Break off two of the legs so they don't short against anything, then twist the other two and move them closer together so they fit in the holes, and solder them in place.