Abstract
Background Mental workload is a critical consideration in the design of complex man–machine systems. Among the various mental workload detection techniques, multimodal techniques that integrate electroencephalogram (EEG) and functional near-infrared spectroscopy (fNIRS) signals have attracted considerable attention. However, existing EEG–fNIRS-based mental workload detection methods have shortcomings, such as complex signal acquisition channel configurations and low detection accuracy, which restrict their practical application.
Methods The signal acquisition configuration was optimized by analyzing feature importance in a mental workload recognition model, and a more accurate and convenient EEG–fNIRS-based mental workload detection method was constructed. A classical Multi-Task Attribute Battery (MATB) task was conducted with 20 volunteers. Subjective scale data, 64-channel EEG data, and two-channel fNIRS data were collected.
Results Detection accuracy increased with the number of EEG channels. However, accuracy showed no obvious improvement once the number of EEG channels reached 26, at which point the four-level mental workload detection accuracy was 76.25 ± 5.21%. Part of the physiological analysis confirmed the results of previous studies: the θ power of EEG and the concentration of O₂Hb in the prefrontal region increased, while the concentration of HHb decreased, with task difficulty. It was further observed, for the first time, that the energy of each EEG band differed significantly in the occipital lobe region, and that the power of the $\beta_1$ and $\beta_2$ bands in the occipital region increased significantly with task difficulty. The changing range and the mean amplitude of O₂Hb were significantly higher in high-difficulty tasks than in low-difficulty tasks.
Conclusions The channel configuration of EEG–fNIRS-based mental workload detection was optimized to 26 EEG channels and two frontal fNIRS channels. A four-level mental workload detection accuracy of 76.25 ± 5.21% was obtained, which is higher than previously reported results. The proposed configuration can promote the application of mental workload detection technology in military, driving, and other complex human–computer interaction systems.
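The channel-reduction idea summarized above (rank EEG channels by feature importance, then keep the smallest set beyond which accuracy stops improving) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the recognition model, the importance measure, or the stopping criterion. The function names, the summing of per-feature importances within each channel, and the `tolerance` parameter are all hypothetical.

```python
import numpy as np


def rank_channels(importances, feature_channels, n_channels):
    """Aggregate per-feature importances by channel and rank the channels.

    importances      : one importance score per feature (assumed to come from
                       some trained recognition model; the source is hypothetical)
    feature_channels : channel index (0-based) that each feature was computed from
    n_channels       : total number of channels
    Returns (ranked channel indices, most important first; per-channel totals).
    """
    importances = np.asarray(importances, dtype=float)
    feature_channels = np.asarray(feature_channels, dtype=int)
    # Sum the importance of every feature that belongs to the same channel.
    totals = np.bincount(feature_channels, weights=importances,
                         minlength=n_channels)
    # Stable sort on the negated totals: descending order, ties keep channel order.
    ranked = np.argsort(-totals, kind="stable")
    return ranked, totals


def select_top_k(ranked, k):
    """Return the k highest-ranked channel indices in ascending order."""
    return sorted(int(c) for c in ranked[:k])


def smallest_k_at_plateau(accuracies, tolerance):
    """Pick the smallest channel count whose accuracy is near the best.

    accuracies[i] is the (cross-validated) accuracy obtained with the top
    i + 1 channels. `tolerance` is an assumed threshold for "no obvious
    improvement"; the paper's actual criterion is not given in the abstract.
    """
    accuracies = np.asarray(accuracies, dtype=float)
    best = accuracies.max()
    # argmax on a boolean array returns the index of the first True entry.
    return int(np.argmax(accuracies >= best - tolerance)) + 1
```

In use, `rank_channels` would be fed the importances from a fitted model, the classifier would be retrained on the top 1, 2, ..., 64 channels, and the resulting accuracy curve would be passed to `smallest_k_at_plateau` to choose the channel count.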