Except for some 16-bit Intel assembler a long time ago, ARM32 is the variant I know best (although still not that well). I was looking at the fleng runtime, trying to update my knowledge to ARM64, and found some interesting things. I was using OpenBSD at the time, but referred to macOS for some stuff too.
Although fleng wasn't set up to build that way, modern best practice (on UNIX at least) seems to be to preprocess assembler through cpp. Some of the headers in /usr/include are safe to include in assembler source, and <machine/asm.h> is probably the first place to start. It defines some very useful macros like ENTRY/END (to define functions). On OpenBSD it also provides retguard support via RETGUARD_SETUP & RETGUARD_CHECK.
Note that the file extension is a capital S (.S) for assembler that is expected to run through cpp.
The macOS equivalent, mach/arm64/asm.h, has similar but slightly incompatible functionality. There, ENTRY is a gas macro rather than a cpp macro, but this is easily papered over. Usefully, CALL_EXTERN is defined to handle the underscore prefix on that platform too (ENTRY already does this).
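The underscore prefix itself can be handled with plain cpp. Here is a minimal sketch in C, assuming GCC or Clang, which predefine __USER_LABEL_PREFIX__ (an underscore on macOS, empty on ELF systems like OpenBSD). LABEL_PREFIX, EXTERN_SYM and extern_name_of_write are made-up names for illustration and are not taken from either asm.h:

```c
#include <assert.h>
#include <string.h>

/* Assumption: GCC/Clang predefine __USER_LABEL_PREFIX__ ("_" on Mach-O,
 * empty on ELF). Fall back to an empty prefix for other compilers. */
#ifdef __USER_LABEL_PREFIX__
#define LABEL_PREFIX __USER_LABEL_PREFIX__
#else
#define LABEL_PREFIX
#endif

/* Two-level helpers so that arguments are macro-expanded before
 * they are pasted or stringified. */
#define PASTE_(a, b) a##b
#define PASTE(a, b)  PASTE_(a, b)
#define STR_(x)      #x
#define STR(x)       STR_(x)

/* Hypothetical helper: the name the linker sees for a C symbol. */
#define EXTERN_SYM(name) PASTE(LABEL_PREFIX, name)

/* Returns "_write" where symbols carry an underscore, "write" elsewhere. */
static const char *extern_name_of_write(void)
{
    const char *name = STR(EXTERN_SYM(write));
    /* Whatever the prefix is, the name must end in "write". */
    assert(strlen(name) >= 5);
    assert(strcmp(name + strlen(name) - 5, "write") == 0);
    return name;
}
```

In a .S file the same EXTERN_SYM(write) could be used directly as a branch target. Here it is only stringified so the result can be inspected from C.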
A second technique I liked is to use "structured assembler" macros. These can replace unstructured branches with _if ... _endif pairs, etc., but generate the exact same machine code. Feels like updating BASIC code from the 1980s :-)
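To give a feel for the idea without committing to any particular macro package, here is a rough analogy in C rather than real assembler macros: goto stands in for a branch instruction, and the hypothetical S_IF/S_ENDIF macros expand to exactly the branch and label you would otherwise write by hand. The id argument is an assumption of this sketch to keep labels unique; an assembler macro package would more likely use a counter.

```c
/* Hypothetical "structured" macros: S_IF branches past the body when
 * the condition is false, S_ENDIF places the label being branched to. */
#define S_IF(cond, id)  if (!(cond)) goto s_endif_##id;
#define S_ENDIF(id)     s_endif_##id: ;

/* Unstructured version: explicit branch and hand-written label. */
static int clamp_unstructured(int x, int limit)
{
    if (!(x > limit)) goto done;
    x = limit;
done:
    return x;
}

/* Structured version: same branch and label, produced by the macros. */
static int clamp_structured(int x, int limit)
{
    S_IF(x > limit, 1)
        x = limit;
    S_ENDIF(1)
    return x;
}
```

Both functions contain the same conditional branch and the same label; only the source reads differently, which is the whole point of the technique.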
Maybe this is a niche nowadays, but it does seem to pay off for compilers. The common "convert to C" approach can be riddled with undefined behaviour and optimizer landmines.